AN EXAMPLE OF THE USE OF EXTENDED CROSS-OVER
DESIGNS IN THE COMPARISON OF NPH
INSULIN MIXTURES


JOSEPH L. CIMINERA AND K. WOLFE*


Research and Quality Control Divisions
Sharp & Dohme Division of Merck & Company, Inc.
West Point, Pennsylvania


INTRODUCTION


When a solution of protamine and a solution of zinc insulin are
combined, a precipitate of protamine zinc insulin is produced.
Suspensions of this type have been employed in the management of diabetes
and have advantage over unmodified insulin in that they control the
patient’s blood sugar level over a relatively long period of time.
Recently, a new type of long acting insulin, designated as NPH Insulin,
has become available. NPH Insulin is a suspension of protamine zinc
insulin such that the protamine content will not be less than, nor more
than ten percent greater than, the quantity required for the
isophane-ratio. The isophane-ratio is that ratio of insulin to protamine
which results in equivalent amounts of insulin and protamine remaining in
the supernatant as tested by nephelometric procedures. We were interested
in determining whether it would be possible to detect a biological
difference in a preparation in which the protamine content was five percent
less than the quantity required by the isophane-ratio.

NPH Insulin is prepared from insulin which has been previously
assayed and approved by two independent laboratories, as required by
the Food and Drug Administration. For this reason, it is not anticipated
that routine biological assays will be required to control the normal
production of NPH Insulin.


EXPERIMENTAL METHOD


Two NPH Insulin mixtures were prepared; Mixture A (the “standard”)
was prepared to contain the isophane-ratio of protamine to
insulin, Mixture B (the “unknown”) was prepared to contain five percent
less protamine than Mixture A. Each mixture contained a concentration
of insulin equivalent to 40 units per ml.*


*Present address: Camp Detrick, Frederick, Maryland

The experimental procedure employed was a minor modification of
the simple cross-over assay for Globin Zinc Insulin Injection, official
in the U.S. Pharmacopoeia XIV (1). This modification consisted only of
limiting the observation period to six instead of nine hours, with
four post-injection bleeding times equally spaced within the six-hour
period of observation. The real difference between the two mixtures was
expected to be small. For this reason, it was planned beforehand to 
extend the cross-over to include both a switchback and a double switch- 
back design in an attempt to obtain more evidence upon which to base a 
decision. All three designs were analyzed separately and compared. 

Twenty-two female rabbits, weighing from 2.5 to 3.5 kg, were dis- 
tributed randomly between two equal groups. At weekly intervals, 
according to the scheme shown in Table I, single subcutaneous injections 
of either Mixture A or B were administered in volumes of 0.051 ml as 
measured from a micrometer syringe. 


TABLE I
INJECTION SCHEDULE

                          Period (Date)
Group    1 (6-27-50)    2 (7-3-50)    3 (7-10-50)    4 (7-17-50)
  I           A              B              A              B
  II          B              A              B              A


Blood samples were obtained from each rabbit at 0 (initial level), 
1.5, 3.0, 4.5, and 6.0 hours after injection. Blood sugar levels were 
determined for each sample. The bleeding times were spaced equally to 
facilitate computation of the results. The raw results for all animals 
at all time periods are presented in Table II. 


COMPUTATIONAL PROCEDURE 


Brandt (2) has thoroughly discussed the analysis of cross-over 
designs. Our data, however, include an additional sub-unit, bleeding 
times, and require an extension of Brandt’s methods. Although Brandt 
discusses the use of covariance, he does not consider the case where 
the concomitant variable is common to all values in the sub-unit. In
this experiment, the initial blood sugar values are necessarily common
to the blood sugar values for all other bleeding times.


*We are indebted to Dr. Robert J. Westfall, Biochemical Research Dept., Sharp & Dohme, for
the preparation of these mixtures. The technical assistance of Messrs. J. J. Hogue, H. Maxwell, and
S. H. Hunter in the performance of the assays is greatly appreciated.

The physiological relation between time and blood sugar levels, 
following insulin injection, is quite complex and is not adequately 
described by any simple function. Since the objective of the experiment 
was to determine the effect of a difference in protamine content, it was 
felt that the customary pharmacopoeial procedure, which specifies 
that the results be analyzed in terms of the differences in blood sugar 
levels between the two mixtures, should be adopted. The unit selected 
for analysis, therefore, was the difference in blood sugar levels pro- 
duced by the two mixtures in the same group of animals at the various 
test periods. Brandt (2) has shown that in a design of this type the
blood sugar differences are confounded with the periods × groups
interaction when a simple cross-over (2 test periods) is involved, with the
quadratic component of periods × groups when three test periods are
involved and with the cubic component of periods × groups when there
are four test periods. For all but two test periods, where only a simple
subtraction of Period 1 - Period 2 is necessary, the differences are
most easily computed by means of polynomial coefficients, as given
in Table VI and illustrated by example later.

For the purpose of this study, the only relevant factors that need 
be considered are the nature of the differences between preparations 
and the possible influence of the initial blood sugar values upon sub- 
sequent readings. 


A. Analysis of Two-Period Differences 


Table III lists under “hours after injection” the difference in blood 
sugar level between the first and second test periods at each bleeding 
time. These are readily computed from the raw values in Table II. 
Thus, for rabbit number 1, the difference at three hours after injection is

$$P_1 - P_2 = 35 - 52 = -17$$

The “totals” column represents the algebraic sum of the differences
at 1.5, 3.0, 4.5, and 6.0 hours after injection. Thus, for rabbit number 2,

$$4 - 17 - 25 - 30 = -68$$


The remaining columns list the linear, quadratic, and cubic components 
of the blood sugar differences for each animal. These were computed as 
the sums of the products of the differences at each bleeding time and the 
corresponding orthogonal polynomial coefficients for n’ = 4 equally 
spaced bleeding times. The latter may be obtained from Fisher and 
Yates (3) and are reproduced for convenience in Table IV. 
TABLE III
DIFFERENCES AND TERMS FOR TWO-PERIOD ANALYSIS

                         Hours after Injection                Sums of Products (y)
Rabbit No.    0 = x    1.5    3.0    4.5    6.0 Totals Linear  Quad.  Cubic
                                                   (y)
     1          -13      5    -17    -12    -26    -50    -88      8    -46
     2           -8      4    -17    -25    -30    -68   -110     16    -10
     3          -13     13     34     -8      4     43    -69     -9    117
     4          -26      9    -25    -12    -34    -62   -116     12    -82
     5           -4     13     -8    -17    -26    -38   -126     12    -12
     6           13      0     -8     17     31     40    118     22    -44
     7            0     21    -12     -5     -9     -5    -83     29    -51
     8           -4     27     -8      0    -21     -2   -136     14    -72
     9            9     25      0    -30    -34    -39   -207     21     31
    10            9      9    -23     -4    -17    -35    -59     19    -83
    11           -9     12    -12     -5    -17    -22    -80     12    -50
Sub-totals      -46    138    -96   -101   -179   -238   -956    156   -302

    12           13     -9     -8      0    -26    -43    -43    -27    -41
    13           -9     16     13     17     25     71     31     11     -3
    14           -5      0    -22     21      0     -1     43      1   -129
    15            0     -4     -5     25      4     20     54    -20    -82
    16            0    -12    -17      8     -4    -25     49     -7    -67
    17            0     21      8     -4    -43    -18   -204    -26    -28
    18            0      9      5    -12    -38    -36   -158    -22      4
    19            0     10    -10     -8      0     -8    -28     28    -16
    20           -5      0     -4      0      4      0     16      8     -8
    21            0      8     -6     34     47     83    157     27    -81
    22            0     13      5     13      4     35    -19     -1    -33
Sub-totals       -6     52    -41     94    -27     78   -102    -28   -484

Differences between Sub-totals
                -40     86    -55   -195   -152   -316   -854    184    182


As an example, the sum of products for the linear component for rabbit
number 3 would be derived from Tables III and IV as follows:

$$(13)(-3) + (34)(-1) + (-8)(+1) + (4)(+3) = -69$$


TABLE IV
ORTHOGONAL POLYNOMIAL COEFFICIENTS FOR n’ = 4

Component        Coefficients
Linear           -3    -1    +1    +3
Quadratic        +1    -1    -1    +1
Cubic            -1    +3    -3    +1
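
The component calculation described above can be sketched in a few lines of Python. This is only an illustration: the function name `components` and the dictionary layout are hypothetical, and the input is the set of differences for rabbit number 3 as read from Table III.

```python
# Orthogonal polynomial coefficients for four equally spaced bleeding
# times, as listed in Table IV.
COEFFICIENTS = {
    "linear": (-3, -1, 1, 3),
    "quadratic": (1, -1, -1, 1),
    "cubic": (-1, 3, -3, 1),
}


def components(differences):
    """Return the total and the three sums of products for one rabbit.

    `differences` holds the blood sugar differences at 1.5, 3.0, 4.5
    and 6.0 hours after injection, in that order.
    """
    result = {"total": sum(differences)}
    for name, coefs in COEFFICIENTS.items():
        result[name] = sum(c * d for c, d in zip(coefs, differences))
    return result


# Differences for rabbit number 3 (Table III).
rabbit_3 = components([13, 34, -8, 4])
print(rabbit_3)
```

The same function applies unchanged to the three-period and four-period differences, since only the input row changes.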


The differences between the sub-totals as shown in the last row of
Table III represent the differences between the two insulin mixtures
(Mixture A - Mixture B). The difference between the sub-totals in
the “totals” column is a measure of the difference in the mean blood sugar
levels for the two mixtures, while the differences between the sub-totals
for the “sums of products” columns characterize the nature of the
difference over the four bleeding times or, in other words, serve as a
means of comparing the blood sugar curves produced by the two insulin
mixtures. This will be elaborated on further in the section on
interpretation of results.

There now remains the problem of testing the mixture differences 
for significance and of determining the effect, if any, of the initial 
blood sugar levels upon the post-injection observations. The present 
official U.S. Pharmacopoeia (1) assays of long acting insulin prepara- 
tions take no account of a possible effect of the initial blood sugar 
level. Earlier investigators (4, 5), however, have shown that this may 
be of considerable importance. Covariance analysis with the initial 
blood sugar level as the concomitant variable was employed in this 
investigation. The analysis is conveniently laid out in the form shown 
in Table V. There are four sources of variation (differences) relevant 
to the mixture comparison, each with a sum of squares representing the 
difference between mixtures and an error term with 19 degrees of 
freedom, after adjusting for covariance. The analysis is conducted 
on a whole-unit basis. 

The sums of squares and products in all four sections are obtained
from the terms in the second and the last four columns of Table III,
labeled (x) and (y), respectively. The method of computation is the
same for all four sections and is illustrated below for the linear term:

$$[x^2]:\quad \text{Mixtures} = \frac{(-40)^2}{44} = 36.36$$

where 44 = (2)(22) = sum of the squares of the coefficients for obtaining
the two-period differences times the number of rabbits. The former may
be obtained directly from Table VI.

$$[x^2]:\quad \text{Error} = \frac{(-13)^2 + \cdots + (0)^2 - [(-46)^2 + (-6)^2]/11}{2} = 813.18$$

where 11 = number of rabbits per group and
2 = sum of the squares of the coefficients for obtaining the two-period
differences, which may be obtained directly from Table VI.

$$[xy]:\quad \text{Mixtures} = \frac{(-40)(-854)}{44} = 776.36$$

$$[xy]:\quad \text{Error} = \frac{(-13)(-88) + \cdots + (0)(-19) - [(-46)(-956) + (-6)(-102)]/11}{2} = -704.73$$

$$[y^2]:\quad \text{Mixtures} = \frac{(-854)^2}{44} = 16575.36$$

$$[y^2]:\quad \text{Error} = \frac{(-88)^2 + \cdots + (-19)^2 - [(-956)^2 + (-102)^2]/11}{2} = 82275.55$$

The residual total sum of squares and that for error are obtained in
the usual way from

$$[y^2] - \frac{[xy]^2}{[x^2]}$$

and the adjusted sum of squares for mixtures is obtained as the difference
of these two residuals.

The interpretation of this analysis and that for three and four test
periods is deferred to the section on interpretation of the results.

The coefficients for obtaining the required “differences” for two,
three, and four test periods and their sums of squares are shown in
Table VI.
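
As a check on the arithmetic for the [x²] terms, the following Python sketch recomputes the mixtures and error sums of squares from the x column of Table III. The function names are hypothetical, and the x values are those read from Table III (with the value for rabbit 1 taken from the worked products in the text).

```python
# Two-period differences in initial blood sugar (x column of Table III).
group_1_x = [-13, -8, -13, -26, -4, 13, 0, -4, 9, 9, -9]
group_2_x = [13, -9, -5, 0, 0, 0, 0, 0, -5, 0, 0]

# Sum of squares of the two-period coefficients (+1, -1), from Table VI.
COEF_SS = 2


def mixtures_ss(group_1, group_2, coef_ss=COEF_SS):
    """Sum of squares for mixtures: (difference of sub-totals)^2 / divisor."""
    diff = sum(group_1) - sum(group_2)
    n_rabbits = len(group_1) + len(group_2)
    return diff ** 2 / (coef_ss * n_rabbits)


def error_ss(group_1, group_2, coef_ss=COEF_SS):
    """Within-group sum of squares, divided by the coefficient sum of squares."""
    raw = sum(v * v for v in group_1 + group_2)
    between = sum(sum(g) ** 2 / len(g) for g in (group_1, group_2))
    return (raw - between) / coef_ss


print(round(mixtures_ss(group_1_x, group_2_x), 2))
print(round(error_ss(group_1_x, group_2_x), 2))
```

The products and y terms follow the same pattern, with the x values replaced by products of x and y or by the y values themselves.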


B. Analysis for Three-Period Differences 

Tables VII and VIII show the required terms and the analysis for 
three-period “differences”. 

TABLE VI
COEFFICIENTS FOR OBTAINING “DIFFERENCES”

Test                                  Sums of
Periods       Coefficients            Squares
   2          +1, -1                      2
   3          +1, -2, +1                  6
   4*         +1, -3, +3, -1             20

*The coefficients for four test periods normally would be -1, +3, -3, +1. The signs were
reversed in order to make the results comparable with the two and three test period results.


The “differences” in columns 2 to 6 are computed from the values in
Table II and the coefficients for three test periods in Table VI. Thus,
for Rabbit 1, at 1.5 hours after injection, the required “difference” is
obtained as follows:

$$(+1)(52) + (-2)(47) + (+1)(52) = 10$$


The other columns are computed in the same manner as for Table III. 

The method of computation for the analysis of variance is identical
with that for Table V, except for a change in the divisors. The divisor
for mixtures becomes 132 (= 6 × 22), while the divisor for error
becomes 6. The latter is obtained from Table VI.
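
The period “differences” and their divisors can be illustrated with a short Python sketch. The names `period_difference` and `divisors` are hypothetical; the coefficients are those of Table VI, and the example values are the three readings quoted above for Rabbit 1 at 1.5 hours.

```python
# Coefficients for obtaining the period "differences" (Table VI).
PERIOD_COEFFICIENTS = {
    2: (1, -1),
    3: (1, -2, 1),
    4: (1, -3, 3, -1),
}


def period_difference(values):
    """Combine one rabbit's readings over 2, 3 or 4 test periods."""
    coefs = PERIOD_COEFFICIENTS[len(values)]
    return sum(c * v for c, v in zip(coefs, values))


def divisors(n_periods, n_rabbits=22):
    """Return (divisor for mixtures, divisor for error)."""
    coef_ss = sum(c * c for c in PERIOD_COEFFICIENTS[n_periods])
    return coef_ss * n_rabbits, coef_ss


print(period_difference([52, 47, 52]))
print(divisors(3))
```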


C. Analysis for Four-Period Differences 


Tables IX and X show the required terms and the analysis for
four-period “differences”. Computational procedure is as described above.


INTERPRETATION OF RESULTS 


The mean difference between the two insulin mixtures for the two- 
period analysis is the simple difference between observations in the 
first and second periods. Using subscripts to denote the period, we have 


$$\frac{A_1 - B_2}{11} = \text{mean difference for Group I}$$

$$\frac{B_1 - A_2}{11} = \text{mean difference for Group II}$$

$$\frac{A_1 - B_2}{11} - \frac{B_1 - A_2}{11} = \frac{A_1 + A_2}{11} - \frac{B_1 + B_2}{11} = \text{difference of means}$$
TABLE VII
DIFFERENCES AND TERMS FOR THREE-PERIOD ANALYSIS

                         Hours after Injection                Sums of Products (y)
Rabbit No.    0 = x    1.5    3.0    4.5    6.0 Totals Linear  Quad.  Cubic
                                                   (y)
     1          -18     10    -34    -41    -56   -121   -205     29    -45
     2           10    -22     -9    -76    -94   -201   -283    -31    129
     3           -4     22     43     18     34    117     11     -5     87
     4          -56    -12    -16      5    -43    -66    -72    -44    -94
     5           -8     13    -50    -59    -52   -148   -204     70    -38
     6          ...      9     18     51     57    135    177     -3    -51
     7           -9     38    -58    -27    -60   -107   -263     63   -191
     8            5     58    -16     13     30     85    -55     91   -115
     9            5     42    -21    -43    -34    -56   -250     72    -10
    10            9     39      2     47      1     89    -69     -9   -173
    11            2     29    -33     -1    -17    -22   -106     46   -142
Sub-totals      -56    226   -174   -113   -234   -295  -1319    279   -643

    12           26    -18    -12     17    -17    -30     32    -40    -86
    13          ...      7     -8     38     72    109    241     49    -73
    14          -22    -13    -40     34      4    -15    125     -3   -205
    15            5    -31    -14     58     39     52    282    -36   -146
    16            5    -16    -38     -5     -4    -63     69     23    -87
    17          -13     21    -13     -8    -48    -48   -202     -6    -84
    18          -13     22     39     22    -16     67   -131    -55     13
    19          -13     10    -24    -30    -38    -82   -150     26    -30
    20          -10      9     -8      4      4      9     -3     17    -41
    21            0      4    -19     34     60     79    221     49   -103
    22          ...      9      1     47     22     79     85    -17   -125
Sub-totals      -57      4   -136    211     78    157    569      7   -967

Differences between Sub-totals
                  1    222    -38   -324   -312   -452  -1888    272    324


The average differences at each of the four bleeding times after
injection are obtained, therefore, from the differences in the last row
of Table III after division by 22 (= 11 × 2).

Similarly, it can be shown that the weighted difference of means for
the three-period analysis is

$$\frac{A_1 + 2A_2 + A_3}{11} - \frac{B_1 + 2B_2 + B_3}{11}$$
and the average differences are obtained from the differences in the last
row of Table VII after division by 44 (= 11 × 4).

Finally, the weighted difference of means for the four-period analysis is

$$\frac{A_1 + 3A_2 + 3A_3 + A_4}{11} - \frac{B_1 + 3B_2 + 3B_3 + B_4}{11}$$

and the average differences are obtained from the differences in the last
row of Table IX after division by 88 (= 11 × 8).


TABLE IX
DIFFERENCES AND TERMS FOR FOUR-PERIOD ANALYSIS

                         Hours after Injection                Sums of Products (y)
Rabbit No.    0 = x    1.5    3.0    4.5    6.0 Totals Linear  Quad.  Cubic
                                                   (y)
     1          -32      7    -76   -108   -120   -297   -413     71    -31
     2           24    -82     -1   -187   -222   -492   -606   -116    418
     3            1     10     31     27     55    123    131      7     57
     4         -125    -59     23     18   -108   -126   -152   -208    -34
     5          -16     17   -121   -147   -113   -364   -416    172    -52
     6          -23      1     15     59     66    141    239     -7    -67
     7          -39     42   -129   -109   -183   -379   -655     97   -285
     8           27    110    -28     47    153    282    204    244   -182
     9          -16     63    -63    -73    -30   -103   -289    169    -63
    10            0    103     35     94     32    264   -154      6   -248
    11           21     46    -59     -1    -21    -35   -143     85   -241
Sub-totals     -178    258   -373   -380   -491   -986  -2254    520   -728

    12           31    -69    -42     13     -8   -106    238    -48   -104
    13          -53    -11   -106     24    102      9    469    173   -277
    14          -60    -56    -93     30     10   -109    321     17   -303
    15            8    -98    -53     77     66     -8    622    -56   -226
    16           15    -29    -76    -31    -22   -158     66     56   -128
    17          -43      9    -59    -33    -60   -143   -181     41   -147
    18          -60     31     69     30     -6    124   -150    -74     80
    19          -43     -4    -42    -62    -76   -184   -236     24    -12
    20          -35    -11    -54    -17     13    -69    109     73    -87
    21           -5     -8    -41     -4     56      3    229     93    -47
    22          -21    -20    -12     90     45    103    297    -53   -241
Sub-totals     -266   -266   -509    117    120   -538   1784    246  -1492

Differences between Sub-totals
                 88    524    136   -497   -611   -448  -4038    274    764

The average differences between the two insulin mixtures at the four 
bleeding times after injection and after two, three, and four periods are 
given in Table XI and shown graphically in Figure I. 


TABLE XI
WEIGHTED AVERAGE DIFFERENCES (A MINUS B) IN MG. PERCENT
OF BLOOD SUGAR

Hours After            Number of Test Periods
Injection              2         3         4
   1.5                3.9       5.0       6.0
   3.0               -2.5      -0.9       1.5
   4.5               -8.9      -7.4      -5.6
   6.0               -6.9      -7.1      -6.9
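
The entries of Table XI can be reproduced from the last rows of Tables III, VII and IX. The sketch below is illustrative only: the names are hypothetical, and the divisor is written as the number of rabbits per group times the sum of the weights 1, 1; 1, 2, 1; and 1, 3, 3, 1, which is one way of reading the divisors 22, 44 and 88 given above.

```python
RABBITS_PER_GROUP = 11

# Differences between sub-totals at 1.5, 3.0, 4.5 and 6.0 hours.
DIFFERENCE_ROWS = {
    2: [86, -55, -195, -152],    # last row of Table III
    3: [222, -38, -324, -312],   # last row of Table VII
    4: [524, 136, -497, -611],   # last row of Table IX
}

# Weights attached to the periods in the weighted difference of means.
WEIGHTS = {2: (1, 1), 3: (1, 2, 1), 4: (1, 3, 3, 1)}


def average_differences(n_periods):
    """Weighted average differences (A minus B), rounded to one decimal."""
    divisor = RABBITS_PER_GROUP * sum(WEIGHTS[n_periods])
    return [round(d / divisor, 1) for d in DIFFERENCE_ROWS[n_periods]]


for periods in (2, 3, 4):
    print(periods, average_differences(periods))
```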

FIGURE I. CURVES OF AVERAGE WEIGHTED DIFFERENCES AFTER TWO TEST
PERIODS, THREE TEST PERIODS, AND FOUR TEST PERIODS, PLOTTED AGAINST
HOURS AFTER INJECTION


In the two-period and three-period tests Mixture B (which was 
prepared to contain five percent less protamine) gave lower blood sugar 
values than Mixture A at one and one-half hours after injection, but 
thereafter permitted a more rapid recovery to normal blood sugar 
levels. In the four-period test, more rapid recovery occurred four and 
one-half hours after injection. Apparently, the five percent decrease 
in protamine content of Mixture B resulted in a more drastic initial 
reduction of blood sugar and a less prolonged effect, which is what
would be expected on the basis of experience with protamine-insulin 
mixtures. 

The F-ratios obtained after two, three, and four periods are sum- 
marized in Table XII. 


TABLE XII
F-RATIOS AFTER TWO, THREE, AND FOUR TEST PERIODS

                            Number of Test Periods
Source of Variation         Two       Three     Four
Totals                      2.53      1.00      0.26
Linear                      4.00      7.02*     9.16**
Quadratic                   5.95*     1.89      0.17
Cubic                       0.53      0.71      0.90

*Significant (P < 0.05)
**Significant (P < 0.01)


The “Totals” is a measure of the mean differences between the two 
insulin mixtures averaged over all four bleeding times after injection. 
That none of the ‘“Totals” are significant was not surprising since the 
initial differences were positive and the subsequent differences negative. 
In general, varying the protamine content of a long acting insulin 
mixture does not displace the blood sugar curve, but simply alters its 
shape. Because of this, mean differences are often of little value in 
comparing insulin preparations, whereas differences in the components 
of regression are of the greatest importance. 

The linear, quadratic and cubic terms under “Source of Variation”
in Table XII are a measure of the differences in the blood sugar curves
resulting from the two insulin mixtures. Practice is necessary to obtain
facility in interpreting the “difference” curve (Fig. I) without
simultaneous reference to the actual blood sugar curves for each mixture.
Brief consideration, however, will clarify the general principles involved.
A horizontal, straight line at the “0” ordinate would indicate identical
response to the two mixtures. A horizontal, straight line at any ordinate
other than 0 would indicate parallel response, or a simple curve
displacement. Any “difference” curve with a significant slope, or significant
curvature, indicates fundamental differences in the time-response curves 
of the test preparations. 

For the two-period analysis the quadratic component was significant
at the five percent level. The linear component was significant at the
five percent level for the three-period analysis and at the one percent
level for the four-period analysis. The reason for this is readily apparent
from Figure I. The curve of differences for the four bleeding times
tends, overall, toward linearity as the number of test periods increases
from two to four. This probably was due to the rabbits becoming less
sensitive to insulin in the last two periods of the experiment. It will be 
noted from Table II that the blood sugar levels for the fourth period 
consistently were higher than those for the other periods. 

Adjustment for initial blood sugar levels by covariance had no effect 
on the final results and in no case was there a significant reduction in 
error due to covariance. This was to be expected since a preliminary 
analysis of the initial values had shown that although there were signifi- 
cant period differences, the comparative levels were consistent for the 
two groups of rabbits from period to period. 

The real difference between the two mixtures was small. It may be 
fortuitous that this was found to be significant in the simple cross-over 
experiment. On the other hand, significant differences persisted as the 
experiment was prolonged to include switchback and double switchback 
designs, despite the apparent decrease in sensitivity of the rabbits. 

The extended cross-over designs yielded essentially the same results
as the simple cross-over experiment. At all stages of the investigation,
there was evidence of a more rapid recovery of blood sugar levels in
animals receiving the mixture with less protamine. The gain in
discriminatory power resulting from the switchback designs does not seem
to be commensurate with the additional cost and the danger of animal loss
associated with these extended tests. Replicated simple cross-over
designs would be more efficient from the standpoint of economics and might
be expected to yield information of equal, or better, statistical validity.

Thanks are due to Mr. Robert A. Harte, Research Administrator for 
Sharp & Dohme, for critical reading of the manuscript; to the referee for 
valuable suggestions as to the form of the analysis and the presentation 
of data; and to Dr. J. R. Monroe, University of North Carolina (State), 
for suggestions on the covariance procedure. 
A SAMPLING INVESTIGATION OF THE EFFICIENCY
OF WEIGHTING INVERSELY AS THE ESTIMATED
VARIANCE*


WILLIAM G. COCHRAN AND SARAH PORTER CARROLL


Johns Hopkins, Baltimore, Md. 
and 
Institute of Statistics, Raleigh, N.C. 


1. INTRODUCTION 

Suppose that we have a number of estimates $x_i$ ($i = 1, 2, \ldots, k$),
normally and independently distributed about the same mean $\mu$ with
different variances $\sigma_i^2$. If the values of the $\sigma_i^2$ are known, the best
estimate of $\mu$ is generally agreed to be the weighted mean

$$\bar{x}_w = \sum_{i=1}^{k} w_i x_i / w, \quad \text{where } w_i = 1/\sigma_i^2, \quad w = \sum_{i=1}^{k} w_i$$

If the $\sigma_i^2$ are not known, but we possess estimated variances $s_i^2$,
based on $n_i$ degrees of freedom, respectively, analogy suggests the use
of a weighted mean with weights inversely proportional to the estimated
variances. This mean is

$$\bar{x}_{\hat{w}} = \sum_{i=1}^{k} \hat{w}_i x_i / \hat{w}, \quad \text{where } \hat{w}_i = 1/s_i^2, \quad \hat{w} = \sum_{i=1}^{k} \hat{w}_i$$

Data of this kind may occur when k laboratories make separate
determinations $x_i$ of the same physical or chemical quantity, each with
an estimated standard error, or when a summary is being made of the
results of k replicated experiments, in each of which the difference $x_i$
between a specified pair of treatments has been observed. In practice,
it cannot be taken for granted that the observations $x_i$ are all estimates
of the same mean $\mu$, because personal biases or local conditions of
experimentation may render this assumption false. The discussion in
this paper is confined to situations in which the assumption holds.
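
For concreteness, the weighted mean with weights inversely proportional to the estimated variances can be written as a short Python function. This is a minimal sketch; the function name and the example numbers are hypothetical and are not taken from the paper.

```python
def weighted_mean(estimates, variances):
    """Mean of `estimates` weighted inversely as the supplied variances.

    With true variances this is the weighted mean with known weights;
    with estimated variances it is the estimate studied in this paper.
    """
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    return sum(w * x for w, x in zip(weights, estimates)) / total_weight


# Two hypothetical determinations with estimated variances 1 and 4.
print(weighted_mean([10.0, 12.0], [1.0, 4.0]))
```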


*Research conducted under a contract with the Office of Naval Research. 
Some results about the distribution of $\bar{x}_{\hat{w}}$ are known. When the
degrees of freedom $n_i$ are all equal to $n$, the limiting distribution of
$\bar{x}_{\hat{w}}$, as the number of estimates k tends to infinity, is normal, (1), with
mean $\mu$ and variance

$$V(\bar{x}_{\hat{w}}) = \frac{n - 2}{n - 4} \cdot \frac{1}{w} \tag{1}$$

The proof requires that n > 8 and that the $\sigma_i^2$ are bounded above
and below. When the $n_i$ are not equal, the limiting variance takes
the more complex form

$$V(\bar{x}_{\hat{w}}) = \sum_{i=1}^{k} \frac{n_i^2 w_i}{(n_i - 2)(n_i - 4)} \Bigg/ \left[ \sum_{i=1}^{k} \frac{n_i w_i}{n_i - 2} \right]^2 \tag{2}$$

For practical applications, these results may be used as approximations
when the number of estimates k that are being combined is large.
Until recently, no information has been available as to how well the
results apply when k is small. However, Meier (2) has given an
approximation to $V(\bar{x}_{\hat{w}})$, valid for any k, but neglecting terms of order $1/n_i^2$.
His result is

$$V(\bar{x}_{\hat{w}}) \doteq \frac{1}{w} \left[ 1 + 2 \sum_{i=1}^{k} \frac{1}{n_i} \frac{w_i}{w} \left( 1 - \frac{w_i}{w} \right) \right] \tag{3}$$

Variance formulas (1), (2) and (3) are useful for comparing the
precision of $\bar{x}_{\hat{w}}$ with that of other simple estimates of $\mu$, in particular
with the unweighted mean of the $x_i$. These formulas cannot be used,
however, to attach a standard error to an actual value $\bar{x}_{\hat{w}}$ that has been
obtained from a set of data, because the formulas involve the unknown
true weights $w_i$. For this purpose, Cochran (1) showed that an
unbiased estimate of the limiting variance, when the $n_i$ are equal, is

$$\hat{V}(\bar{x}_{\hat{w}}) = \frac{n}{(n - 4)\,\hat{w}} \tag{4}$$

Similarly, Meier (2) has shown that an unbiased estimate of (3),
neglecting terms of order $1/n_i^2$, is

$$\hat{V}(\bar{x}_{\hat{w}}) = \frac{1}{\hat{w}} \left[ 1 + 4 \sum_{i=1}^{k} \frac{1}{n_i} \frac{\hat{w}_i}{\hat{w}} \left( 1 - \frac{\hat{w}_i}{\hat{w}} \right) \right] \tag{5}$$
The present paper gives the results of sampling investigations which 
were carried out by the junior author (3) in order to learn something 
about the variance of $\bar{x}_{\hat{w}}$ when n and k are both small. Although the
scope of these investigations was restricted by the heavy computation
involved, as is often the case with sampling studies, the results provide
a partial check on the range of application of Meier’s formulas and give
some information for values of n and k that are beyond this range.


2. METHOD OF CALCULATION


At first sight, the sampling investigations appeared a formidable
task because of the multiplicity of variables. Even confining attention
to the case where all $n_i$ are equal to n, it was desired to cover rather
thoroughly the range of values of both n and k between 2 and 20. Then
there was the problem of what sets of variances $\sigma_i^2$ should be investigated.

It appeared, however, that if the variance of $\bar{x}_{\hat{w}}$ was expressed in
the form

$$V(\bar{x}_{\hat{w}}) = f(n, k)/w \tag{6}$$

the factor f(n, k) would be relatively insensitive to variations in the $\sigma_i^2$.

Several results support this conjecture. From equation (1) it
follows that the limiting value of f(n, k), as k tends to infinity, is
(n - 2)/(n - 4), for any bounded set of values of $\sigma_i^2$. Further, as n
tends to infinity, for any fixed k, f(n, k) tends to 1, since the weights
then become the correct weights. When k = 2, the correct variance
can be obtained by numerical integration. Calculations of the variance
by this method for a few sets of values of $n_1$ and $n_2$ (Porter, 1947)
showed that f(n, k) changed by only a few per cent for $\sigma_1^2/\sigma_2^2$ lying
between 0.1 and 10.

Finally, some sampling computations of the variance were made for
three different sets of values of $\sigma_i^2$. In the first set, all $\sigma_i^2$ were taken as
1; in the second, the values were 1/2, 1 and 2, each value holding for
one-third of the $x_i$’s; in the third, the values were 1/4, 1 and 4. Results
are shown in table 1 for k = 3, 6, 12 and 15 and for n = 6, 10 and 20.
As the values of $\sigma_i^2$ become more unequal, f(n, k) tends to decline.
The decreases are small in all cases in table 1, the maximum drop being
about 7 per cent for n = 6, k = 12 and 15. The results suggest that
computations made for $\sigma_i^2$ all equal will tend to give values of f(n, k)
that are slightly too high but not far in error. Consequently, the
principal calculations were made for the case in which all $\sigma_i^2$ are equal to 1.

The procedure was as follows. In samples in which the $s_i^2$ are fixed,
and $\sigma_i^2 = 1$, $\bar{x}_{\hat{w}}$ is normally distributed with mean $\mu$ and variance

$$\sum_{i=1}^{k} \hat{w}_i^2 / \hat{w}^2 \tag{7}$$
TABLE 1
Effect of inequality in $\sigma_i^2$ on f(n, k)
Values of f(n, k)


k = number of estimates
n $\sigma_i^2$ 3 6 12 15
6 (1, 1, 1) 1.22 1.39 1.54 1.59
(1/2, 1, 2) 1.22 1.39 1.52 1.54
(1/4, 1, 4) 1.18 1.32 1.43 1.47
10 (1, 1, 1) 1.15 1.22 1.28 1.30
(1/2, 1, 2) 1.14 1.22 1.27 1.30 
(1/4, 1, 4) 1.19 1.25 
20 (1, 1, 1) 1.06 1.10 BI | .12 
(1/2, 1, 2) 1.05 1.09 1.11 1.11 
(1/4, 1, 4) 1.04 1.08 1.11 1.11 


The values of $s_i^2$, and hence of $\hat{w}_i = 1/s_i^2$, were obtained by squaring
and adding from a table of normal deviates (4). The values of $\hat{w}_i$ were
then grouped in sets of k, each set yielding one value of the conditional
variance of $\bar{x}_{\hat{w}}$ by substitution in (7). Enough sets were computed for
each n and k so that the mean value of the variance over the group
appeared stable (the average coefficient of variation of the mean was
1.9 per cent). Finally, since w = k when all $\sigma_i^2$ are equal to 1, the
factor f(n, k) is k times this mean variance, as can be seen from
equation (6).
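
The sampling procedure just described can be imitated with a pseudo-random number generator in place of a table of normal deviates. The Python sketch below is an illustration under stated assumptions, not the authors' computation: the function name, the number of sets and the seed are arbitrary choices, and each $s_i^2$ is taken as the mean of n squared standard normal deviates (the scale cancels in the ratio).

```python
import random


def simulate_f(n, k, n_sets=20000, seed=1):
    """Estimate f(n, k) when all true variances equal 1.

    For each set of k estimated variances, the conditional variance of
    the weighted mean is sum(w_i^2) / (sum(w_i))^2, as in (7); f(n, k)
    is k times the average of this quantity over all sets.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sets):
        weights = []
        for _ in range(k):
            # Estimated variance on n degrees of freedom, by squaring
            # and adding n normal deviates.
            s2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n)) / n
            weights.append(1.0 / s2)
        w_hat = sum(weights)
        total += sum(w * w for w in weights) / (w_hat * w_hat)
    return k * total / n_sets


print(round(simulate_f(8, 2), 2))
```

For k = 2 the result can be compared with the exact value (n + 2)/(n + 1) quoted in the next section.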


TABLE 2
Values of f(n, k) such that $V(\bar{x}_{\hat{w}}) = f(n, k)/w$

           k = number of estimates that are being combined
 n     2*     2     3     4     5     6     8    10    12    15    20    ∞†
 2    1.33  1.35  1.61  1.92  2.17  2.44  2.86  3.45  3.85  4.76  5.88    ∞
 4    1.20  1.22  1.35  1.49  1.61  1.72  1.92  2.13  2.33  2.56  2.86    ∞
 6    1.14  1.15  1.22  1.30  1.35  1.39  1.45  1.49  1.54  1.59  1.64  2.00
 8    1.11  1.11  1.20  1.23  1.28  1.28  1.33  1.37  1.39  1.43  1.45  1.50
10    1.09  1.09  1.15  1.18  1.20  1.22  1.25  1.27  1.28  1.30  1.33  1.33
12    1.08  1.09  1.12  1.15  1.16  1.18  1.18  1.20  1.20  1.23  1.23  1.25
15    1.06  1.09  1.10  1.12  1.15  1.15  1.16  1.18  1.19  1.20  1.20  1.18
20    1.05  1.05  1.06  1.09  1.10  1.10  1.11  1.12  1.13  1.13  1.13  1.13

*These values obtained by the formula f(n, k) = (n + 2)/(n + 1)
†These values obtained by the formula f(n, ∞) = (n - 2)/(n - 4)
3. SAMPLING RESULTS FOR THE VARIANCE OF $\bar{x}_{\hat{w}}$


The values obtained for f(n, k) are shown in table 2. For k = 2,
with the $\sigma_i^2$ all equal, the exact value of f(n, k) is easily found to be
(n + 2)/(n + 1). These exact values appear in the first column of
table 2; the corresponding values from the sampling investigation
appear in the second column, and indicate good agreement with the
exact values.

For k = ∞, the values shown in table 2 are obtained from the formula
(n - 2)/[(n - 4)w] for the variance of $\bar{x}_{\hat{w}}$ in the limiting distribution, as
given previously in equation (1).

Since the variance of $\bar{x}_w$ is 1/w when the weights are known exactly,
the quantity f(n, k) is the factor by which the variance is inflated owing
to errors in the estimated weights $\hat{w}_i$. Table 2 indicates that this
inflation is less serious when k is small than when k is large. With
weights based on 8 degrees of freedom, for instance, the variance is
inflated by 50 per cent when many estimates are being combined, but
only by 11 per cent when two estimates are being combined.
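
The two formula-based columns of table 2 are easy to tabulate. The following Python sketch (function names hypothetical) evaluates the exact factor for k = 2 and the limiting factor for large k, treating the limiting variance as infinite when n does not exceed 4, as the table does.

```python
import math


def f_two_estimates(n):
    """Exact inflation factor for k = 2 with equal true variances."""
    return (n + 2) / (n + 1)


def f_many_estimates(n):
    """Limiting inflation factor as k tends to infinity."""
    if n <= 4:
        return math.inf
    return (n - 2) / (n - 4)


for n in (2, 4, 6, 8, 10, 12, 15, 20):
    print(n, round(f_two_estimates(n), 2), round(f_many_estimates(n), 2))
```

With n = 8 the two factors are 1.11 and 1.50, the 11 per cent and 50 per cent inflations mentioned above.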


4. COMPARISON WITH THE UNWEIGHTED MEAN 


A simple alternative to $\bar{x}_{\hat{w}}$ is the unweighted mean $\bar{x}$. A comparison
of the precisions of $\bar{x}$ and $\bar{x}_{\hat{w}}$ is of practical interest, because there is no
point in undertaking the extra calculation involved in $\bar{x}_{\hat{w}}$ unless a
reasonable gain in precision is anticipated.

The situation most favorable to the unweighted mean is that when
the $\sigma_i^2$ are all equal. In this event the unweighted mean is fully efficient.
Consequently, the values of f(n, k) in table 2 indicate the maximum
inflation in variance that will occur if $\bar{x}_{\hat{w}}$ is used in place of $\bar{x}$. Since
in practice we do not know by how much the $\sigma_i^2$ vary, we might be willing
to regard this inflation of the variance as a premium paid for insurance
against the possibility that the $\sigma_i^2$ vary greatly (in which event $\bar{x}$ would
be of low efficiency). Table 2 suggests that if n exceeds 20, the premium
is not high, but over most of the table the potential inflation of variance
is unfortunately well over 10 per cent.

More generally, the variance of $\bar{x}$ is

$$V(\bar{x}) = \frac{1}{k^2} \sum_{i=1}^{k} \sigma_i^2$$

as compared with the approximate variance for the weighted mean,

$$V(\bar{x}_{\hat{w}}) = f(n, k)/w$$
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By comparing the two variances, working recommendations can be 
made about the use of the two estimates. The difficulty is, however, 
to know what amount of variation in the a; is typical of practical con- 
ditions. Comparisons will be given for the case k = 2. In this case 


4w,w, (n+ Iw’ 


where we have used the approximation to f(n, 2) given in the first 
column of table 2. 
Hence the relative precision of @,; to is 


(n+2)4ww, (n+2) 49 ’ 


where 9, = w,/wW, = o:/o; , is the ratio of the variances of the two 
estimates z, and x,. Table 3 shows the relative precision (in per cent) 
for a series of values of n and ¢. 


TABLE 3
Relative precision (in per cent) of the weighted to the
unweighted mean, for k = 2.

                $\phi = \sigma_2^2/\sigma_1^2$
  n      1    1.5     2     3     4     6
  2     75     78    84   100   117   153
  4     83     87    94   111   130   170
  6     88     91    98   117   137   179
  8     90     94   101   120   141   184
 10     92     96   103   122   143   187
 12     93     97   105   124   145   190
 20     95    100   107   127   149   195
  ∞    100    104   112   133   156   204
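
The entries of Table 3 can be recomputed from the relative precision for k = 2, here taken as (n + 1)(1 + φ)² / [(n + 2) 4φ] with φ the ratio of the larger to the smaller variance; that closed form is the assumption on which this sketch rests, and the function name is illustrative. The values it gives agree with the printed table to within about one unit in the last place.

```python
def relative_precision(n, phi):
    """Per cent precision of the weighted relative to the unweighted mean, k = 2.

    n   : degrees of freedom behind each estimated weight (may be infinite)
    phi : ratio of the two true variances
    """
    factor = 1.0 if n == float("inf") else (n + 1) / (n + 2)
    return 100 * factor * (1 + phi) ** 2 / (4 * phi)


ratios = (1, 1.5, 2, 3, 4, 6)
for n in (2, 4, 6, 8, 10, 12, 20, float("inf")):
    row = " ".join(f"{relative_precision(n, phi):6.1f}" for phi in ratios)
    print(f"{n!s:>4} {row}")
```

With φ = 3 and n = 2 the function returns exactly 100, which is the break-even point noted in the discussion of the table.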


If the variance ratio for the two estimates lies between 1 and 2,
the gain in precision from the weighted mean is at
most 12 per cent, and the smaller values of n show a loss in precision.
When the variance ratio exceeds 3, on the other hand, the weighted 
mean is superior, or as good, for all values of n down to 2, and the gains 
in precision may be substantial. 

To summarize, the unweighted mean is preferable if the ratio of the
larger (true) variance to the smaller is not more than 2. If the ratio
lies between 2 and 3, the unweighted mean appears preferable unless the
weights are each based on, say, at least 12 degrees of freedom. If the
ratio exceeds 3, the weighted mean is preferable even if only 4 degrees
of freedom are available to estimate the weights.


5. COMPARISON WITH MEIER’S FORMULA 


Table 1 also provides a partial check on Meier’s approximate formula
(3) for $V(\bar{x}_w)$, subject to the restrictions that the comparison covers only
the case where the $\sigma_i^2$ are equal and that the values in table 1 are themselves subject to some sampling error. When all $w_i$ are equal and all
$n_i$ are equal, Meier’s formula reduces to

$$V(\bar{x}_w) = \frac{1}{w}\left[1 + \frac{2(k - 1)}{nk}\right]. \qquad (8)$$

The ratios of the variances in (8) to those in table 2 are shown in
table 4.
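
At the two ends of the range of k the ratio can be checked without sampling, since the exact factor is (n + 2)/(n + 1) for k = 2 and (n - 2)/(n - 4) in the limit of large k when the variances are equal. The sketch below assumes that formula (8) has the bracket 1 + 2(k - 1)/(nk); the function names are illustrative.

```python
def meier_factor(n, k):
    """w times the variance given by formula (8), equal weights and equal d.f."""
    return 1 + 2 * (k - 1) / (n * k)


def ratio_two_estimates(n):
    """Formula (8) divided by the exact factor (n + 2)/(n + 1) for k = 2."""
    return meier_factor(n, 2) / ((n + 2) / (n + 1))


def ratio_limit(n):
    """Limit of formula (8), 1 + 2/n, divided by the exact limit (n - 2)/(n - 4)."""
    return (1 + 2 / n) / ((n - 2) / (n - 4))


for n in (6, 8, 10, 12, 15, 20):
    print(n, f"{ratio_two_estimates(n):.3f}", f"{ratio_limit(n):.3f}")
```

For n = 10 the limiting ratio is exactly 0.90, so that for large k the formula understates the variance by 10 per cent at that number of degrees of freedom.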


TABLE 4 
Ratio of variance given by Meier’s formula to variance in 
Table 2 
k = number of estimates 
n 2 3 4 5 6 8 10 12 15 20 ∞
2 1.13 1.04 .91 83 75 50 41 33 00 
4 1.04 99 .93 87 83 .75 .68 63 57 52 00 
6 1.03 1.00 .96 94 92 .89 87 85 82 80 66 
8 1.01 of 86.8 94 95 .91 89 88 86 86 83 
10 1.01 98 .97 97 96 .94 93 92 92 90 90 
12 1.00 99 .97 97 97 .97 96 96 94 94 94 
15 1.01 99 .98 96 87 95 94 93 94 96 
20 1.00 1.01 .99 98 98 .98 98 97 97 98 98 


From inspection of table 4, Meier’s formula appears to underestimate
the true variance, the relative underestimation increasing as k increases.
If we are willing to regard a 6 per cent underestimation of the variance as 
tolerable, table 5, derived from table 4, shows the smallest values of n 
for which Meier’s formula is satisfactory in this sense. 

When at most 5 estimates are being combined, the sampling in- 
vestigation suggests that Meier’s approximation does remarkably well, 
being satisfactory for values of n as low as 4 or 6. 

TABLE 5
Smallest values of n for which Meier’s formula underestimates
by less than 6 per cent.

Number of estimates, k      2    3    4    5    6    8   ≥10
Smallest no. of d.f., n     4    4    6    6    8   10    12


The increase in the underestimation by Meier’s formula when k
becomes large can be attributed to the effect of terms in $1/n^2$ and
higher orders. As $k \to \infty$, Meier’s formula gives

$$V(\bar{x}_w) = \frac{1}{w}\left(1 + \frac{2}{n}\right).$$

On the other hand, the correct limiting variance, by formula (1), may be
written

$$\frac{(n - 2)}{(n - 4)}\,\frac{1}{w} = \frac{1}{w}\left(1 + \frac{2}{n} + \frac{8}{n^2} + \frac{32}{n^3} + \cdots\right).$$


6. COMPARISON WITH MEIER'S FORMULA FOR THE ESTIMATED
VARIANCE OF $\bar{x}_w$
The sampling data were also used to investigate the performance of
Meier’s formula (5) for the estimated variance of $\bar{x}_w$. The procedure
was as follows. The formula reads

$$v(\bar{x}_w) = \frac{1}{\hat{w}}\left[1 + \frac{4}{\hat{w}^2}\sum_{i=1}^{k}\frac{1}{n_i}\,\hat{w}_i(\hat{w} - \hat{w}_i)\right]. \qquad (5)$$

For any specified n and k, a large number of sets of k independent values
of $\hat{w}_i$, each derived from n degrees of freedom, had already been assembled for the determination of f(n, k) as described in section 2. By
substitution in formula (5), each set provided one sample value of
$v(\bar{x}_w)$. The average $\bar{v}(\bar{x}_w)$ of this quantity, taken over all the sets, is an
estimate of the true mean given by Meier’s formula. The ratio of
$\bar{v}(\bar{x}_w)$ to the variance of $\bar{x}_w$ as found from the same group of sets, i.e.
to f(n, k)/w, was then computed. The argument is that if Meier’s
formula is unbiased, these ratios should fluctuate about a value close
to 1. As before, the comparison is restricted to the case where the
$\sigma_i^2$ are all equal and the $n_i$ are all equal.
The ratios are shown in table 6. Calculations were made only for
$n \geq 6$, since Meier’s formula, which neglects terms of order $1/n^2$, was
not expected to be valid for n < 6.
TABLE 6
Ratio of average value of Meier’s estimated variance to variance in 
Table 1 
k = number of estimates 
n 2 3 4 5 6 8 10 12 15 20 

6 96 93 83 83 =. 81 77 7 86.72 70 68 
8 1.01 91 92 87.87 s2 8b 81 78 76 
10 1.01 97 95 93 .92  .89 89 =. 88 87 86 
12 99 98 97 95 .94 94 92 .92  .89 90 
15 97 98 99 .94 .93 93 93 .92 90 92 
20 1.01 1.01 98 .99 .98 96 99 .97 99 96 
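
The sampling procedure described in this section can be imitated on a small scale. The sketch below is not the authors' computation: it assumes equal true variances of 1 (so that w = k), draws each estimated variance as a chi-square variate divided by its n degrees of freedom, takes formula (5) in the form (1/ŵ)[1 + (4/ŵ²) Σ ŵ_i(ŵ - ŵ_i)/n_i], and measures the true variance of the weighted mean by averaging its conditional variance given the weights. All names, the seed and the number of replicates are illustrative.

```python
import random


def simulate(n, k, reps=50_000, seed=1):
    """Return (estimate of f(n, k), ratio of mean of formula (5) to true variance)."""
    rng = random.Random(seed)
    sum_true = 0.0   # conditional variance of the weighted mean, summed over sets
    sum_meier = 0.0  # value of formula (5), summed over sets
    for _ in range(reps):
        # s_i^2 is chi-square(n)/n when sigma_i^2 = 1; the estimated weight is 1/s_i^2
        w_hat = [n / rng.gammavariate(n / 2, 2) for _ in range(k)]
        total = sum(w_hat)
        # variance of sum(w_i x_i)/sum(w_i) for fixed weights and unit variances
        sum_true += sum(w * w for w in w_hat) / total ** 2
        bracket = 1 + 4 / total ** 2 * sum(w * (total - w) / n for w in w_hat)
        sum_meier += bracket / total
    true_var = sum_true / reps
    f_nk = k * true_var                      # variance expressed as f(n, k)/w, w = k
    ratio = (sum_meier / reps) / true_var
    return f_nk, ratio


f_nk, ratio = simulate(n=8, k=2)
print(f"f(8, 2) estimated as {f_nk:.3f}; exact value {(8 + 2) / (8 + 1):.3f}")
print(f"mean of formula (5) divided by true variance: {ratio:.3f}")
```

For k = 2 the estimate of f(n, k) can be compared with the exact value (n + 2)/(n + 1), which gives a check on the sampling before larger values of k are tried.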


For k = 2, table 6 indicates that Meier’s formula does extremely well 
down to n = 6. For higher values of k, the formula appears to under- 
estimate to a greater degree than the corresponding formula for the true 
variance (table 4). If, as in section 5, we accept an underestimation by 
6 per cent or less, the smallest values of n for which the formula is 
satisfactory are shown below for the different values of k. 


Number of estimates, k      2    3    4    5    6    8   ≥10


Smallest no. of d.f., n     6   10   10   12   12   12    20


As with the formula for the true variance, the underestimation can be
attributed to the effects of neglected terms of higher order in 1/n. In
the limiting distribution when $k \to \infty$, the mean value of Meier’s formula
(5) can be shown to be

$$\frac{(n - 2)}{nw}\left(1 + \frac{4}{n}\right). \qquad (9)$$

The first term, (n - 2)/nw, is the mean value of $1/\hat{w}$ in (5): the second
term is the mean value of the expression inside the square brackets in (5).

From equation (1), the correct limiting variance of $\bar{x}_w$ is (n - 2)/(n - 4)w. For comparison with (9), this may be written

$$\frac{(n - 2)}{nw}\left(1 + \frac{4}{n - 4}\right). \qquad (10)$$

Inspection of (9) and (10) suggests that for large k, Meier’s formula
would be relatively free from bias if the terms in $1/n_i$ were changed to
terms in $1/(n_i - 4)$. For k = 2, on the other hand, the formula seems
excellent as it stands.

As an empirical attempt to improve the performance of the formula,
we considered replacing the quantities $n_i$ by quantities $n_i'$, where

$$n_i' = n_i - \frac{4(k - 2)}{(k - 1)}. \qquad (11)$$

This substitution leaves the formula unchanged when k = 2, but gives
it the correct mean value in the limiting distribution as $k \to \infty$.

From the sampling data, the average value of the adjusted formula 
was worked out for each n and k, in exactly the same way as for the 
original formula. The ratios of these average values to the true variance 
as estimated from the sampling data are shown in table 7. Thus, 
table 7 presents the same data for the adjusted formula as did table 6 
for the original formula. 


TABLE 7
Ratio of average value of the adjusted Meier’s formula
for the estimated variance to the variance in Table 1.

                      k = number of estimates
 n      2     3     4     5     6     8    10    12    15    20
 6    .96  1.06  1.04  1.10  1.12  1.14  1.15  1.14  1.14  1.18
 8   1.01   .98  1.04  1.01  1.03  1.00  1.01  1.01   .99   .98
10   1.01  1.02  1.03  1.03  1.02  1.01  1.02  1.01  1.00  1.01
12    .99  1.01  1.03  1.01  1.01  1.02  1.00  1.01   .99  1.00
15    .97  1.00  1.02   .9    .98   .99  1.00   .97   .96   .97
20   1.01  1.02  1.00  1.01  1.01   .99  1.03  1.01  1.03  1.00


The adjusted formula appears very satisfactory down to n = 8. 
For n = 6, the adjusted formula works tolerably well for k < 4, but for 
larger values of k it gives too high a variance. 


7. NUMERICAL EXAMPLE 


The application of the adjusted formula will be illustrated by the 
example presented by Meier. The data, from a paper by Snedecor (5), 
give the percentage of albumin in the plasma protein of normal human 
subjects, as obtained in 4 different experiments. The relevant figures 
appear in table 8.

TABLE 8
Illustration of the adjusted formula

                              Column
  (1)        (2)                      (3)      (4)                      (5)
  $s_i^2$    $\hat{w}_i = 1/s_i^2$    $n_i$    $\hat{w} - \hat{w}_i$    $n_i' = n_i - 2.667$
  1.0822     0.9241                   11       2.9869                    8.333
  0.5227     1.9133                   14       1.9977                   11.333
  4.7761     0.2094                    6       3.7016                    3.333
  1.1571     0.8643                   15       3.0467                   12.333

             $\hat{w}$ = 3.9110


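Columns (1)-(3) contain the basic data. Column (4) is formed
from column (2). For the $n_i'$, we have from (11)

$$n_i' = n_i - \frac{4(k - 2)}{(k - 1)} = n_i - \frac{8}{3}.$$

These values appear in column (5). Finally, from the adjusted
form of equation (5),

$$v(\bar{x}_w) = \frac{1}{3.9110}\left[1 + \frac{4}{(3.9110)^2}\left\{\frac{(0.9241)(2.9869)}{8.333} + \cdots + \frac{(0.8643)(3.0467)}{12.333}\right\}\right] = 0.3302.$$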


This is about 6 per cent higher than Meier’s value of 0.3111 as found by
the original formula (5). For the approximate number of degrees of
freedom to be ascribed to this variance, Meier has suggested

$$f = \frac{1}{\sum_{i=1}^{k}\dfrac{1}{n_i}\left(\dfrac{\hat{w}_i}{\hat{w}}\right)^2}.$$

This comes out to 38.6 for these data.
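
The figures in this example can be retraced from columns (1) and (3) of table 8. The sketch below assumes three forms that should be read as working assumptions: formula (5) as (1/ŵ)[1 + (4/ŵ²) Σ ŵ_i(ŵ - ŵ_i)/n_i], the adjustment n_i' = n_i - 4(k - 2)/(k - 1), and the degrees of freedom as the reciprocal of Σ (ŵ_i/ŵ)²/n_i. With these forms it reproduces 0.3111, 0.3302 and 38.6. Variable and function names are illustrative.

```python
s_squared = [1.0822, 0.5227, 4.7761, 1.1571]  # column (1): estimated variances
dfs = [11, 14, 6, 15]                          # column (3): degrees of freedom
k = len(s_squared)

w_hat = [1 / v for v in s_squared]             # column (2): estimated weights
w_total = sum(w_hat)


def estimated_variance(degrees):
    """Formula (5) evaluated with the supplied degrees of freedom."""
    correction = sum(w * (w_total - w) / d for w, d in zip(w_hat, degrees))
    return (1 + 4 * correction / w_total ** 2) / w_total


adjusted_dfs = [d - 4 * (k - 2) / (k - 1) for d in dfs]   # column (5)
v_original = estimated_variance(dfs)
v_adjusted = estimated_variance(adjusted_dfs)
approx_df = 1 / sum((w / w_total) ** 2 / d for w, d in zip(w_hat, dfs))

print(f"original formula : {v_original:.4f}")
print(f"adjusted formula : {v_adjusted:.4f}")
print(f"approximate d.f. : {approx_df:.1f}")
```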
Before the publication of Meier’s formula, we had constructed an
empirical formula for the estimated variance, based on the results of
the sampling investigation as follows:


— 2) + 8) 
— — 4) + 12] 


where $\bar{n}$ is the average number of degrees of freedom in the k estimates.
This formula was obtained by fitting a simple algebraic function to the
values of f(n, k) which we found. It is subject to the same restriction as
the sampling studies, in that it assumes f(n, k) to be independent of the
values of the $\sigma_i^2$, whereas Meier’s formula has a sounder theoretical
basis.

Since Bliss (6) has used this formula (with acknowledgement) in
one of his publications, it may be well to remark that down to $n_i = 6$
the formula agrees well enough with the adjusted Meier formula in the
cases in which we have checked it, being slightly more conservative.
In the present example we have $\bar{n} = 11.5$, k = 4, and the formula
gives 0.339 for the estimated variance.



SUMMARY 


We are given k independent estimates $x_i$ (i = 1, 2, ..., k) of the same
mean $\mu$. The estimates are thought to be of unequal precision, and for
the ith estimate we have an unbiased estimate $s_i^2$ of its variance $\sigma_i^2$,
based on $n_i$ degrees of freedom. This paper describes the results of a
sampling investigation undertaken some years ago in order to study the
variance of the weighted mean

$$\bar{x}_w = \frac{\sum \hat{w}_i x_i}{\sum \hat{w}_i}, \qquad \hat{w}_i = 1/s_i^2.$$

The variances were obtained for values of k between 2 and 20, and for
values of $n_i$ (assumed all equal) between 2 and 20. The variances found
from the sampling investigation were expressed in the form f(n, k)/w,
where $w = \sum 1/\sigma_i^2$. Since there is reason to believe that the factor
f(n, k) is relatively independent of the $\sigma_i^2$, the sampling computations
were made for the case in which all $\sigma_i^2$ are equal.

Since f(n, k) = 1 when the correct weights $1/\sigma_i^2$ are used, the factor
f(n, k) gives a measure of the extent to which the variance of $\bar{x}_w$ is
inflated owing to sampling errors in the weights $\hat{w}_i$. For given n,
f(n, k) increases steadily as k increases, so that the inflation of variance
is smallest when only a few estimates are being combined.

The results for the variance of $\bar{x}_w$ enable its precision to be compared
with that of the unweighted mean $\bar{x}$. When k = 2, taking $\sigma_2^2$ as the
larger variance, $\bar{x}$ is preferable if $\sigma_2^2/\sigma_1^2 < 2$, while $\bar{x}_w$ is preferable, for
any value of n down to 4, if $\sigma_2^2/\sigma_1^2 > 3$. If this ratio lies between 2 and 3,
$\bar{x}_w$ appears preferable if the weights are based on at least 12 degrees of
freedom each.

The results provide a partial check on approximate formulas recently
developed by Meier for the variance and the estimated variance of $\bar{x}_w$.
In these formulas, terms in $1/n_i^2$ are ignored. The comparisons suggest
that if 5 or fewer estimates are being combined, Meier’s formula for the
true variance is satisfactory for values of n down to 6. It is satisfactory
for any number of estimates if n is at least 12.

Meier’s formula for the estimated variance

$$v(\bar{x}_w) = \frac{1}{\hat{w}}\left[1 + \frac{4}{\hat{w}^2}\sum_{i=1}^{k}\frac{1}{n_i}\,\hat{w}_i(\hat{w} - \hat{w}_i)\right]$$

appears adequate down to about n = 12, although it tends to be an
underestimate. An empirical adjustment, which makes its performance
adequate down to n = 8, is to replace $n_i$ in the formula by

$$n_i' = n_i - \frac{4(k - 2)}{(k - 1)}.$$


THE RANDOM WALK OF
TRICHOSTRONGYLUS RETORTAEFORMIS


S. R. BROADBENT AND DAVID G. KENDALL


Magdalen College, 
Oxford 


This paper examines the behaviour of certain larvae in terms of a 
random walk of the “Brownian motion” type, and places on record 
the solution to a problem suggested by their characteristics. 

The larvae of the helminth Trichostrongylus retortaeformis are 
hatched from eggs in the excreta of sheep or rabbits, and wander ap- 
parently at random until they climb and remain on blades of grass 
where they are eaten by another animal, in whose intestines the cycle 
recommences. The question considered here is: what is the distribution 
of the larvae thus trapped on blades of grass? 


1. THE RANDOM WALK 


On the assumption that the x and y coordinates of a larva, measured
on a plane with origin at the point of release, are independent Gaussian
variables with mean zero and variance $\sigma^2$, their joint distribution is

$$\frac{1}{2\pi\sigma^2}\exp\left[-\frac{x^2 + y^2}{2\sigma^2}\right] dx\,dy.$$

Transforming to polar coordinates and integrating with regard to $\theta$,
we find the marginal density at radius r to be

$$\frac{r}{\sigma^2}\exp\left[-\frac{r^2}{2\sigma^2}\right] dr \qquad (0 < r < \infty),$$

and thus we get the expected proportion contained in a circle of radius
r (the radial cumulative distribution) to be

$$P_r = 1 - \exp\left[-\frac{r^2}{2\sigma^2}\right].$$

If now we assume that this distribution results from a random walk
of the “Brownian motion” type we have the relation

$$\sigma^2 = \omega t,$$

where $\omega$ is a diffusion constant. (It is the rate of increase of the variance,
$\sigma^2$.) Then at time t the density at radius r will be

$$\frac{r}{\omega t}\exp\left[-\frac{r^2}{2\omega t}\right] dr \qquad (0 < r < \infty) \qquad (1)$$

and the expected proportion in a circle of radius r will be

$$P_r(t) = 1 - \exp\left[-\frac{r^2}{2\omega t}\right]. \qquad (2)$$

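
Relation (2) can be checked by direct simulation, taking it in the form P_r(t) = 1 - exp[-r²/(2ωt)] with the x and y coordinates at time t drawn as independent Gaussian variables of variance ωt. The values ω = 0.5, t = 10 and r = 3 in the sketch are arbitrary illustrations, as are the function names.

```python
import math
import random


def radial_cdf(r, omega, t):
    """Expected proportion of larvae within radius r at time t, relation (2)."""
    return 1 - math.exp(-r * r / (2 * omega * t))


def simulated_proportion(r, omega, t, n_larvae=100_000, seed=0):
    """Proportion of simulated larvae lying within radius r at time t."""
    rng = random.Random(seed)
    sd = math.sqrt(omega * t)  # each coordinate has variance omega * t
    inside = sum(
        1 for _ in range(n_larvae)
        if math.hypot(rng.gauss(0, sd), rng.gauss(0, sd)) <= r
    )
    return inside / n_larvae


expected = radial_cdf(3, omega=0.5, t=10)
observed = simulated_proportion(3, omega=0.5, t=10)
print(f"relation (2): {expected:.4f}   simulation: {observed:.4f}")
```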
2. TRAPPING 


2.1. We now introduce as a first hypothesis the assumption that a
larva, while performing the random walk, may in any short interval
$\delta t$ with probability $\lambda\,\delta t$ come upon a blade of grass and climb up it.
Further, having gone up this channel, it is unable to turn round and
must stay there.

We write $\Pi(t)$ for the probability that a larva is trapped on a blade
of grass in the time interval (0, t); if $\pi(t)\,dt$ is the distribution of time-to-trapping (the period of free motion) then

$$\Pi(t) = \int_0^t \pi(u)\,du \qquad (0 < t < \infty).$$

The probability that a larva is trapped in the interval $(0, t + \delta t)$
is the sum of $\Pi(t)$ and the joint probability that the larva is free at
time t and trapped in the interval $(t, t + \delta t)$:

$$\Pi(t + \delta t) = \Pi(t) + [1 - \Pi(t)]\,\lambda\,\delta t.$$

The solution of this equation is

$$\Pi(t) = 1 - e^{-\lambda t}.$$

Hence

$$\pi(t)\,dt = \lambda e^{-\lambda t}\,dt \qquad (0 < t < \infty). \qquad (3)$$


The final distribution of the trapped larvae will be given by multiplying together the expressions (1) and (3) and integrating the result
with regard to t. We obtain

$$K_0(\rho)\,\rho\,d\rho \qquad (0 < \rho < \infty), \qquad (4)$$

where $\rho = r\sqrt{2\lambda/\omega}$, and $K_0(\rho)$ is a standard Bessel function tabulated,
for example, by Watson (1).
Using the differential equation,

$$\rho K_1'(\rho) + K_1(\rho) + \rho K_0(\rho) = 0,$$

we can integrate the $\rho$-distribution to get the radial cumulative distribution,

$$F(\rho) = \text{prob}\left[r > \rho\sqrt{\omega/2\lambda}\right] = \rho K_1(\rho). \qquad (5)$$

A short table of values of $F(\rho)$ is given below.

TABLE I

$\rho$ =       0.00   0.34   0.56   0.78   1.01   1.26   1.55   1.91   2.41   3.21
$F(\rho)$ =    1.0    0.9    0.8    0.7    0.6    0.5    0.4    0.3    0.2    0.1
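
The values of F(ρ) can be recomputed without tables of Bessel functions. The sketch below takes F(ρ) = ρK_1(ρ) and evaluates K_1 from the standard integral representation K_1(x) = ∫_0^∞ exp(-x cosh t) cosh t dt by the trapezoidal rule; the step length, the upper limit and the function names are choices made here for illustration.

```python
import math


def bessel_k1(x, step=0.01, upper=20.0):
    """Modified Bessel function K_1(x), x > 0, by trapezoidal integration."""
    n_steps = int(upper / step)
    total = 0.5 * math.exp(-x)          # half weight at t = 0, where cosh t = 1
    for i in range(1, n_steps + 1):
        c = math.cosh(i * step)
        total += math.exp(-x * c) * c   # integrand is negligible long before t = upper
    return total * step


def proportion_beyond(rho):
    """F(rho): proportion of trapped larvae lying beyond the scaled radius rho."""
    if rho == 0:
        return 1.0                      # limiting value of rho * K_1(rho)
    return rho * bessel_k1(rho)


for rho in (0.00, 0.34, 0.56, 0.78, 1.01, 1.26, 1.55, 1.91, 2.41, 3.21):
    print(f"rho = {rho:4.2f}   F = {proportion_beyond(rho):.3f}")
```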


2.2. As a second hypothesis we assume that a larva is not necessarily
trapped by the first blade of grass visited, but that it has a probability
p of being trapped when it visits a blade of grass, and that now $\mu\,\delta t$
is the probability that a free larva will visit a blade of grass in a short
interval $\delta t$.

If we define for this case $\Pi_1(t)$ and $\pi_1(t)\,dt$ as in 2.1, the probability
that a larva is trapped in the interval $(0, t + \delta t)$ is now the sum of
$\Pi_1(t)$ and the joint probability that the larva is free at time t, visits a
blade of grass in the interval $(t, t + \delta t)$, and is trapped there. That is,

$$\Pi_1(t + \delta t) = \Pi_1(t) + [1 - \Pi_1(t)]\,\mu\,\delta t\,p.$$

The solution is now

$$\pi_1(t)\,dt = p\mu\,e^{-p\mu t}\,dt \qquad (0 < t < \infty).$$

The formulae (4) and (5) apply as before, with $\lambda = p\mu$.
3. ESTIMATION 


3.1. If we are given a sample of r-values observed at a given time t,
it is a simple matter to estimate the “one-dimensional variance-rate”
$\omega$, for this is equivalent to estimating the variance $\omega t$ of a circular
Gaussian population. We shall have

$$E\left(\sum r_i^2\right) = 2N\omega t$$

(if the sample is of size N), so that

$$\hat{\omega} = \frac{\sum r_i^2}{2Nt} \qquad (6)$$

will be an unbiased sufficient estimate of $\omega$ with sampling variance $\omega^2/N$.


3.2. In the data considered below, the numbers of larvae in concentric annuli at certain times are given. To get the estimate (6) we
must evaluate $\sum r_i^2$ for a particular time, when the values of r may be
taken as independent. If there are $n_p$ larvae in the annulus bounded
by circles of radius p and p + 1 units of length ($\sum n_p = N$), and if we
assume that they are uniformly distributed over the annulus, we find

$$\sum_p n_p\left[(p + \tfrac{1}{2})^2 + \tfrac{1}{4}\right] \qquad (7)$$

as an estimate of $\sum r^2$.


For in the annulus (p, p + 1) the equivalent uniform density in
numbers of larvae per unit area is given by

$$D_p = n_p/[\pi(2p + 1)],$$

and $\sum r^2$ over this annulus is therefore

$$\int_p^{p+1} 2\pi u\,D_p\,u^2\,du = n_p\left[(p + \tfrac{1}{2})^2 + \tfrac{1}{4}\right].$$

Equation (7) follows by summation over p.

There is therefore a correction of amount +N/4 to be made to the
value of $\sum r_i^2$ obtained by assuming the larvae in each annulus to be
located midway between the bounding circles.
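
As an illustration of (7), the counts recorded at t = 5 seconds in Table II (53, 124, 118, 81, 21, 3 in the annuli with outer radii 1 to 6) may be used. The sketch assumes that the estimate (6) has the form Σr²/(2Nt); the function names are illustrative, and the result is in units of (0.7 mm.)² per second.

```python
def sum_r_squared(counts):
    """Formula (7): counts[p] larvae in the annulus bounded by radii p and p + 1."""
    return sum(n_p * ((p + 0.5) ** 2 + 0.25) for p, n_p in enumerate(counts))


def omega_hat(counts, t):
    """Estimate of the variance-rate from annulus counts observed at time t."""
    n_total = sum(counts)
    return sum_r_squared(counts) / (2 * n_total * t)


counts_t5 = [53, 124, 118, 81, 21, 3]   # Table II, t = 5 seconds, N = 400
midpoint_sum = sum(n_p * (p + 0.5) ** 2 for p, n_p in enumerate(counts_t5))

print("sum of r^2 by (7):", sum_r_squared(counts_t5))
print("correction over the midpoint value:", sum_r_squared(counts_t5) - midpoint_sum)
print("estimate of omega:", round(omega_hat(counts_t5, 5), 4))
```

The correction printed in the second line equals N/4 = 100, as stated in the text.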


4. DATA 


4.1. In Table II we give the data kindly made available to us by 
Dr. H. D. Crofton. In one experiment Crofton observed the wanderings 
of a total of 400 larvae; only a small number were placed at one time 
on the horizontal microscope slide. The larvae were released at the 
centre of the field of view, and at successive intervals of 5 seconds (up 
to a total of 30 seconds) the numbers of larvae in concentric annuli of 
radii 1, 2, ..., 12 units of 0.7 mm. were counted.


4.2. If the data satisfy relation (2), log $[1 - P_r(t)]$ plotted against
$r^2/t$ should give a straight line passing through the origin, where $P_r(t)$
is the observed proportion of the number of larvae which lie within a
circle of radius r units at time t. The slope of this line is inversely
proportional to the parameter $\omega$. The graph of these points is given in
Figure I; a straight line through the origin is a reasonable first approximation.

FIG. I (Crofton's data). Log $[1 - P_r(t)]$ against $r^2/t$; see §4.2. Key: figure 1 represents t = 5 seconds, 2 represents t = 10 seconds, etc.

4.3. The estimates of $\omega$ for the six successive times (which are not
independent) using (7) are

The arithmetic mean of these estimates is 0.54. The standard error of
each estimate at each time is about 0.027.


TABLE II
Wandering of larvae of Trichostrongylus retortaeformis
in a horizontal plane
(Crofton’s data)

Time in      Number of larvae in annulus with outer radius
seconds      1    2    3    4    5    6    7    8    9   10   11   12
    5       53  124  118   81   21    3    .
   10       32   60  142   88   41   20   11    6
   15       20   31   69  124   78   45   31    .    2    .
   20       11   40   65   94   96   47   19    6   10    7    5
   25       3 72 87 37 @ 8 8 1
   30       13   22   48   40   77   90   53   34   17    3    5

Notes: (i) the unit of length is 0.7 mm. 
(ii) the first five row-totals are each 400; the last is 399. 


4.4. It will be seen from Figure I that there is some evidence of
systematic deviation from a straight line. The deviation becomes
more prominent when we compare the entries in Table II with the
values given by (2), using the above estimates of $\omega$. It is only at t = 5
seconds that the value of $\chi^2$ for this comparison is within the 5% significance level; at all other times it is well beyond the 1% significance level.

It also seems probable that the values of $\omega$ decrease with increasing
time. Even though the six estimates are not independent, their range
is about 7.1 times their estimated standard error.

For these two reasons it is doubtful whether the model is adequate 
to describe the data fully. It may, however, be of some interest as an 
approximation. 

The authors wish to thank Dr. H. D. Crofton for permission to
publish his observations, Dr. F. C. Frank who first suggested that it
might be worthwhile examining the problem from the present standpoint,
and Dr. D. J. Finney for some helpful comments on the method of
analysis.


SUMMARY 


The distribution of larvae which are trapped on blades of grass 
while performing a random walk of the “Brownian motion” type is 
derived. The adequacy of such a random walk as a model for the 
wandering of larvae of Trichostrongylus retortaeformis is discussed in 
relation to empirical data supplied by Dr. H. D. Crofton. 


REFERENCE
(1) Watson, G. N., Theory of Bessel Functions (Cambridge, 1922).


THE ANGULAR TRANSFORMATION IN QUANTAL ANALYSIS


P. J. CLARINGBOLD, J. D. BIGGERS, AND C. W. EMMENS


Department of Veterinary Physiology, 
University of Sydney, N.S.W., Australia 


SUMMARY 


The angular transformation may be used in two ways in the analysis 
of quantal data. Transformation of the observed response (Eisenhart, 
1947) leads to a quick noniterative but approximate solution. If the 
expected response is transformed, an exact iterative maximum likelihood 
solution is available. Comparisons have been made which indicate the 
practical similarity of the two methods, though where additional 
accuracy is required one cycle of the maximum likelihood solution 
following the method of Eisenhart seems all that is required. 

To overcome difficulties with regions of 0 and 100% response in 
factorial experiments, parallelogram designs have been introduced. 


1. INTRODUCTION 


Two main approaches are available for the analysis of binomial 
(or quantal) data. The first is by a multiple regression technique 
based on the equivalent deviate transformation of Finney (1949), 
which originated in discussions of the pioneer studies of Gaddum (1933), 
Hemmingsen (1933) and Bliss (1934, 1935a, b), and is widely used in 
bioassay work. The second, leading to an analysis of variance, relies 
directly on appropriate transformation of the data. Where the data 
cover a wide range of proportions the angular (or inverse sine) trans- 
formation is relevant. Although the latter method has been applied 
in agricultural and operational research (Bliss 1937, 1938; Cochran 
1938, 1940; Eisenhart, 1947), and is described in several textbooks 
(Snedecor, 1946; Johnson, 1949; Brownlee, 1949 and Mather, 1949), 
it has not been widely used. The purpose of this paper is to discuss 
the application of the angular transformation in the design and analysis 
of multifactor experiments where the response is in quantal form. 
Examples will be taken, for illustrative purposes, from investigations
into the response of the vaginal epithelium of ovariectomized mice to
oestrogens.


2. THE ANGULAR TRANSFORMATION


Since the angular transformation was introduced by Fisher in 1922 
its development has taken two different directions; first as a special 
case of the equivalent deviate transformation, and secondly for the 
equalization of variance where the variance is dependent on the mean. 

Consider a large binomial population in which a proportion P
possesses a given attribute. Let $p_i$ (i = 1, ..., m) be the proportions
of this attribute in successive samples of size n;


the expectation of p = E(p) = P, 
and the variance of p = V(p) = PQ/n, where Q = 1 — P. 


2.1. In the equivalent deviate transformation. 


Finney (1949, 1952c) has discussed the equivalent deviate transformation. For any specified function f(v), a quantity Y, the equivalent
deviate of P, may be defined as a monotonic increasing function of P
by the equation

$$P = \int_{-\infty}^{Y} f(v)\,dv.$$

A well-known substitution in this equation gives the probit transformation. By another substitution the angular transformation may be
derived, and is given by

$$P = \sin^2 Y. \qquad (1)$$


The analysis following this transformation may be carried out in 
one of two ways, (i) as a multiple regression, (ii) as an analysis of 
variance, both methods being iterative. The former method is computed
in an analogous manner to the probit plane technique described
by Finney (1952a) with the added advantage from the computational 
point of view that weights are equal provided n is constant. The 
latter method has been described by Cochran (1940) and follows the 
usual analysis of variance procedures on the working angle values for 
each group. The working angles may be computed using the table of 
maximum working angle and range given by Fisher and Yates (1948), 
following estimation of provisional values by graphical or other means. 

If n is not constant, weighted analyses (weight proportional to n) 
are carried out, but owing to the loss of orthogonality the advantages 
of the analysis of variance are to a large extent lost. 


2.2. In the equalization of variance. 
Curtiss (1943) derived the general theorems for the simplification of
variance of random variables where the variance is dependent on the
mean. Both Curtiss and Eisenhart (1947) have applied the general
theorem to the binomial case and have shown that the transformation
defined by

$$\phi = \phi(p) = 2\arcsin\sqrt{p}, \qquad (2)$$

where $0 \le p \le 1$ and $0 \le \phi \le \pi$ (radian measure), fulfils the requirements. The form of the transformation suggested by Yates to
Bartlett (1936), and by Fisher to Bliss (1937) was

$$p = \sin^2 Y, \quad \text{where } 0 \le p \le 1,\ 0 \le Y \le 90° \text{ (degree measure)}, \qquad (3)$$

for which the variance in large samples is

$$V(Y) = \frac{820.7}{n}. \qquad (4)$$

When n is small it is found that in the extreme ranges of proportions
the variance of Y is still largely dependent on p. Bartlett (1936, 1937)
introduced an empirical correction factor when sample proportions
are 0 and 1, and these have since been derived graphically by Eisenhart
(1947). The transformation incorporating Bartlett’s adjustment becomes

$$\begin{aligned}
Y_B(p) &= \arcsin\sqrt{p}, \quad \text{where } 0 < p < 1,\ 0 < Y < 90° \\
Y_B(0) &= \arcsin\sqrt{1/4n} \qquad\qquad (5) \\
Y_B(1) &= 90 - Y_B(0).
\end{aligned}$$


From all practical points of view the transformation defined by (5) 
fulfills the requirement that the variance is independent of the mean, 
provided 0.05 < P < 0.95, and n > 10 and is constant for all samples. 
Eisenhart (1947) has calculated the variance at P = 0.5 for small 
samples (see Table 1). 
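
The transformation with Bartlett’s adjustment is simple to compute directly. The sketch below works in degrees and takes the adjusted end values as arcsin √(1/4n) for no responders and 90 minus that angle when all respond; the function names are illustrative. For n = 10, 20 and 50 it returns the end values 9.10, 6.42 and 4.05 shown in Table 1.

```python
import math


def angular_transform(responders, n):
    """Angle in degrees for `responders` out of n, with Bartlett's adjustment."""
    if not 0 <= responders <= n:
        raise ValueError("responders must lie between 0 and n")
    zero_angle = math.degrees(math.asin(math.sqrt(1 / (4 * n))))
    if responders == 0:
        return zero_angle
    if responders == n:
        return 90 - zero_angle
    return math.degrees(math.asin(math.sqrt(responders / n)))


def large_sample_variance(n):
    """Large-sample variance of the angle in degrees squared: 820.7/n."""
    return 820.7 / n


for n in (10, 20, 50):
    print(n, round(angular_transform(0, n), 2), round(angular_transform(n, n), 2),
          round(large_sample_variance(n), 1))
```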

An independent study of the binomial variable has been made by 
Ghurye (1949) who arrived at transformations for the equalization of 
variance practically equivalent to the above. 

Following these transformations a one stage analysis of variance is 
carried out. 


2.3. Theoretical comparison of the two approaches. 


The binomial variate poses two problems which must be overcome
before it may be used in small sample estimation problems, (1) discontinuity, (2) information depending on the mean. The discontinuity
may be overcome by replacing the discontinuous observations by a
continuous distribution of working responses. These are based on the
expected responses derived from consideration of the collective data
of the experiment (Fisher, 1935a). By a simple non-linear change of
the scale of measurement the information may be rendered constant
and dependent only on the size of the group, and the angular transformation was introduced for this purpose. The combined use of the
angular transformation and the notion of working responses gives an
exact maximum likelihood solution to the problem (2.1), and with it
exact fiducial inference is possible.

TABLE 1
Table of Bartlett’s correction and variance of $Y_B$ at P = 0.5 for sample size 10-50
inclusive. The table is in degrees to facilitate use of the correction with the
transformation as tabulated by Bliss (1937b) and has been prepared from the table
by Eisenhart (1947).

Sample   $Y_B(0)$   $Y_B(1)$   $V(Y_B)$  |  Sample   $Y_B(0)$   $Y_B(1)$   $V(Y_B)$
 size                                    |   size
  10      9.10      80.90      92.4      |    30      5.24      84.76      28.3
  11      8.67      81.33      83.0      |    31      5.15      84.85      27.4
  12      8.30      81.70      75.3      |    32      5.07      84.93      26.5
  13      7.97      82.03      68.9      |    33      4.99      85.01      25.7
  14      7.68      82.32      63.6      |    34      4.92      85.08      24.9
  15      7.42      82.58      59.0      |    35      4.85      85.15      24.1
  16      7.18      82.82      55.0      |    36      4.78      85.22      23.5
  17      6.96      83.04      51.5      |    37      4.72      85.28      22.8
  18      6.77      83.23      48.5      |    38      4.65      85.35      22.2
  19      6.59      83.41      45.8      |    39      4.59      85.41      21.6
  20      6.42      83.58      43.3      |    40      4.54      85.46      21.1
  21      6.27      83.73      41.1      |    41      4.48      85.52      20.5
  22      6.12      83.88      39.2      |    42      4.43      85.57      20.0
  23      5.98      84.02      37.4      |    43      4.37      85.63      19.5
  24      5.86      84.14      35.8      |    44      4.32      85.68      19.1
  25      5.74      84.26      34.2      |    45      4.27      85.73      18.6
  26      5.63      84.37      32.9      |    46      4.23      85.77      18.2
  27      5.52      84.48      31.6      |    47      4.18      85.82      17.8
  28      5.42      84.58      30.5      |    48      4.14      85.86      17.5
  29      5.33      84.67      29.4      |    49      4.10      85.90      17.1
                                         |    50      4.05      85.95      16.7

For values n > 50 theoretical variance is given by $V(Y_B) = 820.7/n$. If Bartlett’s
correction is required,

$$Y_B(0) = \arcsin\sqrt{1/4n}, \qquad Y_B(1) = 90 - Y_B(0).$$

An approximate solution is provided by the change of scale only
(with arbitrary corrections) and ignoring the effect of discontinuity
(2.2). In this case homoscedasticity has been imposed on the data
only for large samples where the effect of discontinuity is unimportant.
For small samples, however, the information is still dependent on the 
mean. Under these conditions exact fiducial inference is not possible 
(Fisher, 1935b). Since the approximate method is so easy to compute 
it is of importance to ascertain its efficiency, and in section 3 this 
question has been answered by a practical comparison of the two 
methods. 


2.4. Comparison with other transformations. 


Several transformations have been advocated for the analysis of 
quantal data. Although the probit has been recommended as most 
suitable in bioassay (e.g. Bartlett, 1947) practical comparisons between 
the probit, angular, logistic and rectangular transformations by efficient 
methods have detected no important differences between them (Finney, 
1947, 1952c; Biggers, 1951). Both Finney and Berkson (1949) have 
discussed the iterative solutions with reference to the logit transforma- 
tion and recently Dyke and Patterson (1952) have introduced a basically 
similar method for factorial experiments, also using the logit trans- 
formation. Finney (1952c) has discussed in more detail comparisons 
between these transformations and comes to the conclusion that all 
but the rectangular are very nearly the same between 0.02 < P < 0.98. 
Thus, when the data cover a wide range of proportions no one trans- 
formation will usually give a significantly better fit than another, and 
the choice will therefore rest on practical expediency. 

Other reasons, usually put forward particularly with reference to
the probit transformation, are concerned with its supposedly greater
correspondence with reality. This seems of no consequence when the
object of a test or bioassay is demonstrably served quite as efficiently
by the use of much simpler mathematical models. The infinity of
equally valid models has been discussed by many previous authors
(cf. Emmens, 1948).


3. PRACTICAL EVALUATION OF THE METHODS 


3.1. A simple 4? experiment to illustrate methods. 


During studies on the effect of metabolic inhibitors on the vaginal
response to oestrogens it was necessary to ascertain whether the response
was purely additive or whether interactions occurred. Table 2 shows
the design and the results of the experiment.
TABLE 2
The effect of locally administered monoiodoacetate and oestrone on the percentage
response of ovariectomized mice.


Oestrone (µg.)               Monoiodoacetate (µg.)
                 12.5 (-3)*   25.0 (-1)   50.0 (1)   100.0 (3)
   2 (-3)           25           15          15          0
   4 (-1)           25           15          20          5
   8 (1)            45           35          15         15
  16 (3)            70           50          35         20


Number of animals per group = 20


*Logarithmic coding shown in parentheses.


3.11. Analysis of variance of working responses. 


The observed percentage responses were transformed by Bliss’ 
(1937) table and plotted over the abscissa log dose oestrone plus log 
dose monoiodoacetate. A regression plane was fitted by eye and the pro- 
visional values obtained. The method is similar to the probit plane tech- 
nique of Finney (1952a). These provisional values were corrected by the 
corresponding maximum working angle and range tabulated by Fisher 
and Yates (1948) to produce the working angles on which a standard 
analysis of variance was carried out (Table 3). The theoretical variance 


TABLE 3
Analyses of variance (3.11) of transformed data of Table 2.

Source of variation      D.f.    Sum of      Mean       F       P
                                 squares     square

Oestrone                 (3)     (1030.3)
  Linear                  1                   982.1     23.9    <0.001
  Quadratic               1                    47.2      1.2    >0.05
  Cubic                   1                     1.0      0.0    >0.05
Monoiodoacetate          (3)     (1123.0)
  Linear                  1                  1087.1     26.5    <0.001
  Quadratic               1                    11.4      0.3    >0.05
  Cubic                   1                    24.5      0.6    >0.05
Interactions              9       208.4        23.2      0.6    >0.05
Theoretical variance      ∞                    41.0
is given by 820.7 ÷ 20. A test of significance was first carried out
between the interaction mean square with 9 degrees of freedom and
the theoretical variance with infinite degrees of freedom. Since no
significant difference was found the mean squares of individual items
of Table 3 were tested against the theoretical variance.

Throughout this paper the variance ratio test will be used, although
in the case of a theoretical variance with infinite degrees of freedom
the test degenerates to a χ². If the error variance is significantly
different from the theoretical variance the former must be used to
examine the main effects. In these circumstances the F test is necessary,
and in order to avoid the use of both tests in the analysis, the more
general one has been employed.
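
The tests just described can be sketched numerically. The Python fragment below is a minimal illustration and not the authors' computing routine: it forms the theoretical variance 820.7/n for angles measured in degrees and compares some single-degree-of-freedom mean squares of Table 3 with it. The function names are hypothetical. Because the denominator has infinite degrees of freedom, the variance ratio for one degree of freedom is a χ² with one degree of freedom, so its probability can be taken from the normal integral.

```python
import math

def theoretical_variance(n):
    """Theoretical variance of an angular response (degrees) for groups of n animals."""
    return 820.7 / n

def p_one_df(ratio):
    """Upper-tail probability of a variance ratio with 1 and infinite d.f.,
    i.e. of a chi-square with 1 degree of freedom."""
    return math.erfc(math.sqrt(ratio / 2.0))

# Mean squares of single-degree-of-freedom components, copied from Table 3.
mean_squares = {
    "Oestrone linear": 982.1,
    "Oestrone quadratic": 47.2,
    "Monoiodoacetate linear": 1087.1,
    "Monoiodoacetate quadratic": 11.4,
}

variance = theoretical_variance(20)
for name, ms in mean_squares.items():
    ratio = ms / variance
    print(f"{name:28s} F = {ratio:5.1f}   P = {p_one_df(ratio):.4f}")
```

Components with more than one degree of freedom, such as the interaction, would need the general χ² integral, which is not attempted in this sketch.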

Orthogonal polynomials were used to separate the oestrone and
monoiodoacetate effects respectively into their linear, quadratic and
cubic components. In each case, by applying the methods described
by Bliss and Marks (1939)*, the linear components were used to estimate
the regression coefficients and their variances. From these regression
coefficients, and the general mean response, the following equation was
obtained

$$Y = 7.06X_1 - 7.41X_2 + 29.92,$$

where $Y$ is the angle of response,
$X_1$ is the logarithm of the dose of oestrone,
$X_2$ is the logarithm of the dose of monoiodoacetate,
$b_1$ is the regression coefficient of $X_1$,
$b_2$ is the regression coefficient of $X_2$.


The iterative procedure was continued until the difference between
successive regression coefficients was less than 1/10th of the standard
errors of the final coefficients respectively (Fisher and Yates, 1948).
The final equation was

$$Y = 7.01X_1 - 7.38X_2 + 28.98,$$


and this was used to estimate the expected number of positive reactors
in each group in order to calculate χ² for goodness of fit by the
“long-hand” method (see Emmens, 1948). Grouping of extremely low or
high expected responses was carried out as described by Finney (1952a),
dropping a corresponding number of degrees of freedom. Of the remaining
degrees of freedom three are used in the estimation of parameters.
Thus the χ² value has 7 degrees of freedom and is 3.64 (0.9 >
P > 0.8).


*In the original publication of Bliss and Marks the equation for the variance of the regression
coefficient for an even number of groups is given by mistake as

$$V_b = \frac{2V_e}{I^2\,S\,n_p k_1^2} \quad \text{instead of} \quad V_b = \frac{4V_e}{I^2\,S\,n_p k_1^2},$$

where $V_b$ is the variance of the regression coefficient,
$V_e$ is the variance associated with random sampling (in this case the theoretical variance),
$I$ is the logarithmic interval,
$n_p$ is the number per observation (in this case unity),
$k_1$ is the orthogonal coefficient.


TABLE 4
Analysis of variance (3.12) of transformed data of Table 2.

Source of variation      D.f.    Sum of      Mean       F       P
                                 squares     square

Oestrone                 (3)     (1038.3)
  Linear                  1                   978.6     22.6    <0.001
  Quadratic               1                    57.8      1.3    >0.05
  Cubic                   1                     1.9      0.0    >0.05
Monoiodoacetate          (3)     (1063.1)
  Linear                  1                  1044.0     24.1    <0.001
  Quadratic               1                     3.8      0.1    >0.05
  Cubic                   1                    15.3      0.4    >0.05
Interaction               9       181.3        20.2      0.5    >0.05
Theoretical variance      ∞                    43.3


3.12. Analysis of variance of empirical responses. 


After the observed percentage responses were converted to the 
angular values by means of Bliss’ (1937) table, and the value of Bartlett’s 
correction for the zero response inserted from Table 1, a standard 
analysis of variance was carried out (Table 4). The theoretical variance 
for n = 20 was obtained from Table 1. 

The following equation was obtained from the analysis

$$Y = 7.00X_1 - 7.23X_2 + 28.94,$$

and a goodness of fit test applied ($\chi^2_{(7)} = 4.60$, $0.8 > P > 0.7$).


3.2. A comparison of estimated regression coefficients.

A series of factorial experiments have been analysed using both 
methods for the purposes of comparison. The method 3.12 was applied 
first and used to provide provisional values for analysis in two cycles 
by method 3.11. Regression coefficients were calculated at each stage 
and are shown in Table 5. The vertical differences between each of
the eighteen sets of regressions are of no consequence as they do not 
represent estimates of the same parameter. 


TABLE 5
Comparative values of regression coefficients and χ² for goodness of fit obtained in
six factorial experiments analysed by the method of Eisenhart (1947) followed by
two cycles of the Fisher-Bliss method.

Exper-            Eisenhart                                    Fisher-Bliss
iment
No.       b      s_b   χ² (d.f.)    P                  b         b        s_b   χ² (d.f.)    P
                                                       (first    (second
                                                       cycle)    cycle)

1.        1.6    2.4   16.2 (16)    0.5 > P > 0.3       1.8       1.8     2.3   14.0 (16)    0.5 > P > 0.3
         19.7    2.1                                   12.6      12.6     2.0
         -6.9    2.1                                   -6.4      -6.4
        -11.4    2.1                                  -11.5     -11.5     2.0

2.        2.8    2.1   7.25 (17)    0.98 > P > 0.95     2.7       2.7     2.0   8.76 (17)    0.95 > P > 0.90
         13.7    2.1                                   13.3      13.3     2.0
         10.4                                          10.1      10.1
         -9.1    2.1                                   -8.5      -8.6

3.       -0.3    2.4   14.7 (18)    0.7 > P > 0.5      -0.6      -0.6     2.3   12.1 (18)    0.9 > P > 0.8
         10.9    2.1                                   10.9      11.0     2.0
        -14.6    2.1                                  -14.9     -15.0     2.0

4.        7.0    1.5   4.60 (7)     0.8 > P > 0.7       7.1       7.0     1.4   3.64 (7)     0.9 > P > 0.8
         -7.2    1.5                                   -7.4      -7.4     1.4

5.       17.7    3.3   9.20 (10)    0.7 > P > 0.5      17.6      17.6     3.2   7.81 (10)    0.7 > P > 0.5
          9.8    1.5                                    9.7       9.7     1.4

6.       12.7    1.4   7.46 (10)    0.7 > P > 0.5      12.6      12.6     1.3   7.69 (10)    0.7 > P > 0.5
          3.1                                           2.6       2.7
         -2.7    0.9                                   -2.7      -2.8     0.9

Total χ² (78)          59.4         P = 0.77                                    54.0         P = 0.90


Difference between methods: χ² (1) = 5.4, 0.05 > P > 0.02


Making horizontal comparisons between the three values of each
regression, few differences are seen; while the difference between
methods 3.12 and 3.11 is noticeable, although negligible for practical
purposes, there is no sensible difference between the two cycles of
method 3.11. This illustrates the statement of Fisher (1925) that “in
approaching the maximum likelihood solution by successive approximations
... starting with an inefficient statistic, a single process of
approximation will in ordinary cases give an efficient statistic differing
from the maximum likelihood solution, by a quantity which with
increasing samples decreases as $n^{-1}$.” A second cycle results in an
efficient statistic differing from the maximum likelihood solution by a
quantity which, in increasing samples, is of order $n^{-2}$.

The main difficulty with the maximum likelihood solution where 
the number of variables is three or more lies in the determination of 
provisional estimates by graphical means. If the graphical estimates 
are far from the truth many iterations may be necessary. The above 
results indicate that the method 3.12 gives a very good basis for the 
maximum likelihood process and in many practical situations may be 
sufficiently accurate in itself; the decision, however, will rest with the
investigator. 


4. 0 AND 100% RESPONSES 


The problem of 0 and 100% responses has been widely discussed. 
Transformations whose limits are defined in the interval (-∞, ∞) are
supposed to overcome this difficulty since 0 and 100% response groups 
can be considered to be within the distributions, and are assigned 
expectations. In practice, however, where several of these occur on 
the regression lines, bad fits are found whichever methods are used. 

Where a distribution with a finite range is used as a mathematical 
model two kinds of extreme response may be recognized, one arising 
as a random fluctuation in a sample with an expectation 0 < P < 1 
and one which will always occur outside the finite range. How may we 
distinguish these in practice? Past experience with the experimental 
material, or preliminary pilot experiments, will indicate the mean and 
range of definition and, provided the treatment combinations given 
are in this range any extreme responses may for all practical purposes 
be assumed to be within the finite range of the distribution, can be 
given expectations, and used additively. 

In factorial designs, where all combinations of factor levels are 
employed, large regions of 0 and 100% responses, which contribute 
nothing to the knowledge of effects, are sometimes inevitable. Whether 
expectations are given to these values, e.g. with the probit transforma- 
tion, or corrections made as in the case of Bartlett’s adjustment with 
the angular transformation, regions of them will give bad fits. Clearly 
the solution of the problem is one of experimental design based upon 
the evidence of small comprehensive pilot experiments. 
4.1. Parallelogram designs.

Consider an n × m factorial design (Fig. 1). Let the treatment
values corresponding to n be $X_{1i}$ ($i = 1, \dots, n$) and those corresponding
to m be $X_{2j}$ ($j = 1, \dots, m$). If preliminary investigations indicate that
the areas shown by circles are regions of extreme responses we may
choose to investigate the levels falling between them in the region of
useful observation. Thus while all levels of $X_1$ are used, only certain
levels of $X_2$ are employed at each of the levels of $X_1$. If the restricted
levels of $X_2$ are displaced in equal steps as we pass along the values of
$X_1$, the design takes the form of a parallelogram, and the functional
relationship between the values of $X_1$ and $X_2$ is linear. These designs
may be analysed by either the analysis of variance or the regression
plane technique.

FIGURE 1

5. ANALYSIS OF VARIANCE IN PARALLELOGRAM DESIGNS 

The values of $X_{1i}$ and $X_{2j}$ are fixed by the experimental design;
corresponding to each value of $X_1$ are sets of values of $X_2$, each set
being of equal size but of different values. These numbers collectively
form a skew array, and for application of the analysis of variance must
be transformed into a rectangular array.

For treatment values situated on a line parallel to the sloping side
of the parallelogram in the figure, $X_1$ and $X_2$ are connected by a relation

$$C_1X_1 + C_2X_2 = \text{constant} = X_2^* \text{ say}.$$

Let us suppose that the intervals between consecutive values on this
line are $h_1$ and $h_2$ respectively for $X_1$ and $X_2$.


If $X_2^*$ can be chosen so that $X_2^* = X_2$ when $X_1 = 0$,
then $C_2 = 1$ and $C_1h_1 + C_2h_2 = 0$.




478 BIOMETRICS, DECEMBER 1953 


Thus we find $C_1 = -h_2/h_1$, i.e.

$$X_2^* = X_2 + C_1X_1. \tag{6}$$


For simplicity the values of $X_{1i}$ and $X_{2j}$, if equally spaced, may be
coded orthogonally.


By means of the analysis of variance the constants of the equation

$$Y - \bar{Y} = b_1X_1 + b_2X_2^* \tag{7}$$

may be investigated.


The true regression of Y on $X_1$ is obtained by the substitution of
(6) in (7), rearrangement and simplification, whence (7) becomes

$$Y - \bar{Y} = b_1^*X_1 + b_2X_2, \quad \text{where } b_1^* = b_1 + C_1b_2.$$

Since the regressions of Y on $X_1$ and $X_2^*$ have been estimated
independently the variance of $b_1^*$ is given by

$$V_{b_1^*} = V_{b_1} + C_1^2V_{b_2}.$$


Since $b_1^*$ and $V_{b_1^*}$ are computed from $b_1$ and $b_2$ it is necessary that
the system of polynomial coefficients used to code $X_1$, $X_2$ and $X_2^*$
should have identical linear scale intervals, i.e. the same λ for linear
regression (see Fisher and Yates, 1948).
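
As an illustration of transformation (6), the Python sketch below builds a parallelogram design in coded units and checks that the transformation turns the skew array into a rectangular one. It is a hedged example only: the coded levels and the value of the coefficient are those quoted for the example of Section 5.1, the function and variable names are hypothetical, and the intervals h1 and h2 are assumed to be taken with their signs.

```python
def c1_from_intervals(h1, h2):
    """Coefficient C1 in X2* = X2 + C1*X1 (with C2 = 1), from C1*h1 + h2 = 0."""
    return -h2 / h1

def parallelogram_design(x1_levels, x2_star_levels, c1):
    """Return the (X1, X2) treatment combinations lying on the parallelogram."""
    return [(x1, x2s - c1 * x1) for x1 in x1_levels for x2s in x2_star_levels]

# Assumed coded levels: one step of X1 (h1 = 2) is accompanied by a
# displacement of the X2 levels by two steps downwards (h2 = -4).
c1 = c1_from_intervals(2, -4)
design = parallelogram_design([-3, -1, 1, 3], [-3, -1, 1, 3], c1)

# Transforming back: every X1 level carries the same set of X2* values,
# so the array (X1, X2*) is rectangular and open to the analysis of variance.
for x1 in [-3, -1, 1, 3]:
    x2 = sorted(b for a, b in design if a == x1)
    x2_star = [b + c1 * x1 for b in x2]
    print(x1, x2, x2_star)
```

Each level of X1 is paired with a different set of X2 levels, yet the transformed values are the same four in every row.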


5.1. A 4² example.


In this experiment the effect of the time between two injections of 
oestrone on the vaginal response has been investigated. Preliminary 
experiments suggested that a log-linear relationship existed between 
the time interval (X,) and dose (X,). The doses of oestrone were 
chosen in the region of useful observation. The plan and results of the 
experiment are shown in Table 6. 

The transformation

$$X_2^* = X_2 + 2X_1, \quad \text{where } X_2^* = -3, -1, 1, 3 \ (\lambda = 2),$$

is applicable in this case. After a full analysis of variance (3.12) was
made of the transformed data (Table 7), regression coefficients and their
variances were calculated, viz.

$$Y - 42.90 = -1.87X_1 + 9.81(X_2 + 2X_1),$$
or
$$Y = 17.74X_1 + 9.81X_2 + 42.90,$$
$$V_{b_1^*} = V_{b_1} + 4V_{b_2} = 5V_{b_2} = 10.84, \text{ since } V_{b_1} = V_{b_2},$$
$$V_{b_2} = 2.17.$$
TABLE 6


The effect of the interval of time ($X_1$) between divided doses of oestrone ($X_2$) on the
percentage vaginal response in ovariectomized mice.


Oestrone              Time interval between injections (hours)
(10^-? µg.)      0.89 (-3)    2.67 (-1)    8.0 (1)    24.0 (3)
    1 (-9)                                                20
    2 (-7)                                                45
    4 (-5)                                   15           50
    8 (-3)                                   30           70
   16 (-1)                       30          35
   32 (1)                        55          65
   64 (3)           25           60
  128 (5)           30           70
  256 (7)           60
  512 (9)           85


A χ² goodness of fit test showed that the model fitted the data
($\chi^2_{(10)} = 9.20$, $0.7 > P > 0.5$).

For the purposes of comparison method 3.11 was also applied to
the data (Table 8). The following equation was obtained after two
cycles

$$Y = 17.58X_1 + 9.70X_2 + 42.92,$$
$$\chi^2_{(10)} = 7.81, \quad 0.7 > P > 0.5.$$


TABLE 7
Analysis of variance of the data of Table 6 following the transformation $X_2^* = X_2 + 2X_1$,
where $X_2$ is the dose of oestrone and $X_1$ is the time interval (method 3.12).

Source of variation      D.f.    Sum of      Mean       F       P
                                 squares     square

Oestrone                 (3)     (1949.0)
  Linear                  1                  1922.7     46.9    <0.001
  Quadratic               1                     3.8      0.9    >0.05
  Cubic                   1                    22.5      0.5    >0.05
Time interval             3       260.3        86.8      2.0    >0.05
Interaction               9       202.5        22.5      0.5    >0.05
Theoretical variance      ∞                    43.3


TABLE 8
Analysis of variance of the data of Table 6 following transformation $X_2^* = X_2 + 2X_1$,
where $X_2$ is the dose of oestrone and $X_1$ is the time interval (method 3.11).

Source of variation      D.f.    Sum of      Mean       F       P
                                 squares     square

Doses                    (3)     (1910.3)
  Linear                  1                  1883.7     44.7    <0.001
  Quadratic               1                     3.1      0.1    >0.05
  Cubic                   1                    23.5      0.6    >0.05
Time interval             3       254.5        84.8      2.1    >0.05
Interactions              9       191.0        21.2      0.5    >0.05
Theoretical variance      ∞                    41.0


5.2. A 3³ example.


The effect of the time interval ($X_1$), the number of injections in
this interval ($X_3$) and the dose of oestradiol-3:17β ($X_2$) was investigated
with each factor at three levels. The design and results of the
experiment are shown in Table 9.


TABLE 9


The effect of time interval ($X_1$), frequency of injection ($X_3$) and dose of
oestradiol-3:17β ($X_2$) on the percentage vaginal response of ovariectomized mice.


Time        Frequency of      Dose of oestradiol-3:17β (10^-? µg.)
interval    injection in
(hours)     time interval
                              2.63 (-0.415)*   5.25 (0.585)    10.50 (1.585)
16 (-1)     2 (-1)               50.0             66.7             75.0
            4 (0)                83.3            100.0            100.0
            8 (1)                58.3             66.7             83.3

                              1.75 (-1)        3.50 (0)         7.00 (1)
24 (0)      2                    50.0             75.0             75.0
            4                    66.7             66.7            100.0
            8                    41.7             75.0             83.3

                              1.17 (-1.585)    2.33 (-0.585)    4.67 (0.415)
36 (1)      2                    16.7             41.7             58.3
            4                    33.3             91.7             83.3
            8                    41.7             50.0             83.3


Number of animals per group = 12


*The coefficient ($X_{2p}$) in parentheses is the logarithm to the base 2 of the dose given, and is
defined by

Dose = 3.50 × 2^($X_{2p}$),
e.g. 2.63 = 3.50 × 2^(-0.415).
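
The dose coding of Table 9 can be checked with a few lines of Python. This is a small sketch under stated assumptions: the helper names are hypothetical, the base dose 3.50 and the codes are read from the table, and the coefficient 0.585 is the one used in the transformation $X_2^* = X_2 + 0.585X_1$ applied to these data.

```python
BASE_DOSE = 3.50  # dose carrying the code 0 in Table 9

def dose_from_code(code):
    """Dose corresponding to a base-2 logarithmic code."""
    return BASE_DOSE * 2.0 ** code

def transformed_level(dose_code, time_code, c1=0.585):
    """Level of X2* = X2 + c1 * X1 for a given dose code and time code."""
    return dose_code + c1 * time_code

# Dose codes used at the shortest time interval (time code -1).
codes_at_16_hours = (-0.415, 0.585, 1.585)
for code in codes_at_16_hours:
    print(round(dose_from_code(code), 2), round(transformed_level(code, -1), 3))
```

The displaced dose codes at each time interval collapse to the same three transformed levels, which is what makes the array rectangular.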
TABLE 10
Analysis of variance of the data of Table 9 following the transformation $X_2^* = X_2 + 0.585X_1$,
where $X_2$ is the dose of oestradiol-3:17β and $X_1$ is the time interval (method 3.12).

Source of variation      D.f.    Sum of      Mean       F       P
                                 squares     square

Oestradiol-3:17β         (2)     (2195.5)
  Linear                  1                  2146.0     28.5    <0.001
  Quadratic               1                    49.5      0.7    >0.05
Time interval            (2)     (834.8)
  Linear                  1                   796.5     10.6    0.01-0.001
  Quadratic               1                    38.3      0.5    >0.05
Frequency                (2)     (1414.3)
  Linear                  1                   132.0      1.8    >0.05
  Quadratic               1                  1282.3     17.0    <0.001
Interactions             20      1025.3        51.3      0.7    >0.05
Theoretical variance      ∞                    75.3


By means of the transformation

$$X_2^* = X_2 + 0.585X_1, \quad \text{where } X_2^* = -1, 0, 1 \ (\lambda = 1),$$

an analysis of variance (3.12) was made (Table 10) and regression
coefficients were calculated, viz.

$$Y - 56.34 = -6.65X_1 + 10.92(X_2 + 0.585X_1) - 4.87(3X_3^2 - 2),\dagger$$
or
$$Y = 66.08 - 0.27X_1 + 10.92X_2 - 14.62X_3^2. \tag{8}$$

In order to test the significance of $b_1^*$ its variance was computed:

$$V_{b_1^*} = 5.62.$$


It is obvious that $b_1^*$ is not significant, showing that the adoption
of this design was unnecessary. The design was based on evidence
from experiments with the related compound oestrone, and the difference
in behaviour of oestradiol-3:17β has since been confirmed (Biggers
and Claringbold, 1954).
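
The arithmetic behind this conclusion can be retraced with the relations $b_1^* = b_1 + C_1b_2$ and $V_{b_1^*} = V_{b_1} + C_1^2V_{b_2}$. The Python sketch below is illustrative only. The function name is hypothetical, and it assumes that $b_1$ and $b_2$ each have variance 75.3/18, that is, the theoretical variance of Table 10 divided by the sum of squares of the coded levels -1, 0, 1 over the 27 groups; this assumption is not stated in the text but reproduces the quoted variance closely.

```python
import math

def back_transform(b1, b2, c1, var_b1, var_b2):
    """Regression of Y on X1 itself, and its variance, from the fitted
    coefficients of X1 and X2* = X2 + c1 * X1."""
    b1_star = b1 + c1 * b2
    var_b1_star = var_b1 + c1 ** 2 * var_b2
    return b1_star, var_b1_star

# Coefficients of X1 and X2* from the fitted equation; C1 = 0.585.
# Assumed common variance of b1 and b2: theoretical variance / 18.
var_b = 75.3 / 18
b1_star, var_b1_star = back_transform(-6.65, 10.92, 0.585, var_b, var_b)
ratio = b1_star / math.sqrt(var_b1_star)

print(f"b1* = {b1_star:.2f}, variance = {var_b1_star:.2f}, ratio to s.e. = {ratio:.2f}")
```

The coefficient is a small fraction of its standard error, in agreement with the statement that it is not significant.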

The goodness of fit test for equation (8) showed that the model was
satisfactory ($\chi^2_{(18)} = 14.7$, $0.7 > P > 0.5$).

Equation (8) was used to compute provisional estimates for an


†The ξ′ functions (Fisher and Yates, 1948) have been employed for the calculation of the regression
coefficient of the quadratic term.


TABLE 11
Analysis of variance of the data of Table 9 following transformation $X_2^* = X_2 + 0.585X_1$,
where $X_2$ is dose of oestradiol-3:17β and $X_1$ the time interval (method 3.11).

Source of variation      D.f.    Sum of      Mean       F       P
                                 squares     square

Oestradiol-3:17β         (2)     (2199.1)
  Linear                  1                  2171.4     31.8    <0.001
  Quadratic               1                    27.7      0.4    >0.05
Time interval            (2)     (926.1)
  Linear                  1                   887.6     13.0    <0.001
  Quadratic               1                    38.5      0.6    >0.05
Frequency                (2)     (1479.4)
  Linear                  1                   126.4      1.8    >0.05
  Quadratic               1                  1353.0     19.8    <0.001
Interaction              20      1024.0        51.2      0.8    >0.05
Theoretical variance      ∞                    68.4


exact iterative solution by method 3.11 (Table 11). After two cycles
the following equation was obtained,

$$Y = 66.36 - 0.60X_1 + 10.98X_2 - 15.02X_3^2,$$
$$\chi^2_{(18)} = 12.1, \quad 0.9 > P > 0.8.$$


6. CONCLUSIONS 


The comparisons which have been made in Sections 3 and 5 demon- 
strate that the two methods give similar answers. The advantages of 
the maximum likelihood solution lie in versatility (e.g. n need not be 
greater than 9 and need not be constant) and accuracy. The need for 
rapid methods of computation in routine and research laboratories has 
led to the development of many short-cut, but inefficient, approximate 
methods (for discussion see Berkson, 1950; Biggers, 1951; Finney, 
1952b), none of which have been extended to analyse multifactor de- 
signs. The results of the present paper show the ease and speed of 
solution by the analysis of variance of the observed transformed response 
provided the data are orthogonal. Furthermore, if the investigator 
requires an exact solution this analysis forms an excellent starting point 
for the iterative process. 

The analysis of variance has enormous advantages over the regres- 
sion methods in multifactor designs. First, tests of significance of 
treatment differences are all that are usually required and not the fitting 
of constants. Secondly, when four or more regression coefficients are 
to be fitted the iterative procedures other than with the angular
transformation are formidable (Finney, 1952a).

In the second part of the paper it has been shown that the problem 
of extreme responses may be overcome by the choice of suitable designs. 
This seems preferable to the selection of transformations defined in the 
infinite interval and all of which lead to unequal weight per transformed 
observation, thus making the computational procedures laborious. 
In this regard the angular transformation has considerable advantages 
over both the probit and logit transformations. 

The designs described may be modified to many types of region of 
useful observation. They will be determined by the relationship between 
this region and the independent variables. For example, if a severe 
quadratic effect is predicted a transformation of the form 


$$X_2^* = X_2 + c_1X_1 + \cdots$$


would be suitable. 

The use of these designs in the case of a continuous dependent 
variable can be imagined where the variance of an observation depends 
on the level of response. Also these designs may be used where certain 
regions are of no interest or can be neglected in the interests of economy 
of time or material. 

We wish to express our thanks to Sir Ronald Fisher, F.R.S., Dr. 
J. O. Irwin and Miss H. Newton Turner for helpful criticism of the 
manuscript. Also our thanks are due to Mr. T. Nalukowyj for his 
translation of the paper by Bliss (1937). 
THE TRUNCATED POISSON DISTRIBUTION
R. L. PLACKETT


University of Liverpool 


1. The Poisson distribution is defined by

$$P_r = \frac{\lambda^r e^{-\lambda}}{r!} \qquad (r = 0, 1, 2, \dots).$$

Examples have arisen in which this distribution is truncated because
no observations are available (i) for r = 0, or (ii) for r greater than
some specified value s. The truncated distribution (i), defined by

$$P_r = \frac{\lambda^r e^{-\lambda}}{r!\,(1 - e^{-\lambda})} \qquad (r = 1, 2, 3, \dots),$$

has been considered by David and Johnson in a recent issue of this
journal [1]. In particular, they derive the maximum likelihood estimate
of λ and its asymptotic variance, and discuss the efficiency of estimation
by moments. Distribution (ii) has been studied by Moore [2], who
provides a simple estimate of λ and shows the effect of truncation at
different values s. The purpose of the present note is to provide a
similar estimate of λ for distribution (i), to show that it is highly efficient,
and to estimate its sampling variance.

2. Suppose that a sample of size N from a truncated Poisson distribution
of type (i) gives $N_r$ observations equal to r (r = 1, 2, 3, ...).
To estimate an arbitrary function θ(λ), use quantities of the form

$$\theta^* = \sum_{r=1}^{\infty} x_rN_r/N,$$

and evaluate the unknowns $x_r$ by the requirement that θ* is to be unbiased.
Thus

$$\sum_{r=1}^{\infty} \frac{x_r\lambda^r e^{-\lambda}}{r!\,(1 - e^{-\lambda})} = \theta(\lambda),$$

and $x_r$ is obtained as the coefficient of $\lambda^r/r!$ when $(e^{\lambda} - 1)\theta(\lambda)$ is
expanded in powers of λ.
3. First, consider the estimation of λ. If

$$\sum_{r=1}^{\infty} x_r\lambda^r/r! = \lambda(e^{\lambda} - 1) = \sum_{r=2}^{\infty} \lambda^r/(r - 1)!,$$

a comparison of coefficients gives

$$x_1 = 0, \quad x_2 = 2, \quad x_r = r \ (r > 2).$$

The desired estimate is therefore

$$\lambda^* = \sum_{r=2}^{\infty} rN_r/N.$$
Example. A factory employs about ten thousand workers but the
exact number fluctuates with the amount of work available. During
a certain period, the number of workers $N_r$ having r accidents is as
follows


If these data are regarded as a sample of 2390 from distribution (i), then

$$\lambda^* = 747/2390 = 0.3126.$$

On the other hand, the maximum likelihood estimate, $\hat\lambda$, is the solution
of the equation

$$\sum_{r=1}^{\infty} rN_r/N = \hat\lambda/(1 - e^{-\hat\lambda}).$$

The left side is 1.1657 giving $\hat\lambda = 0.3149$.
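
The two estimates can be computed as in the following Python sketch. It is offered as an illustration rather than as the author's procedure: the function names are hypothetical, the small table of counts is invented purely to show the call, and bisection is simply one convenient way of solving the likelihood equation.

```python
import math

def lambda_star(counts):
    """Simple estimate: the sum of r * N_r over r >= 2, divided by N."""
    n = sum(counts.values())
    return sum(r * n_r for r, n_r in counts.items() if r >= 2) / n

def lambda_ml(sample_mean, tol=1e-12):
    """Solve sample_mean = lam / (1 - exp(-lam)) for lam by bisection."""
    if sample_mean <= 1.0:
        raise ValueError("the mean of a zero-truncated sample must exceed 1")
    lo, hi = 0.0, sample_mean  # the root lies between 0 and the sample mean
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid / -math.expm1(-mid) < sample_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical counts {r: N_r}, invented for illustration only.
toy_counts = {1: 10, 2: 4, 3: 1}
print(lambda_star(toy_counts))

# Maximum likelihood estimate for the sample mean quoted in the example.
print(round(lambda_ml(1.1657), 4))
```

The simple estimate needs only one pass over the frequency table, whereas the likelihood equation has to be solved iteratively.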

4. The estimate λ* can be regarded as the mean value of a sample
of size N from a probability distribution with probability $P_1$ for the
value 0 (observed $N_1$ times), $P_2$ for the value 2 (observed $N_2$ times),
$P_3$ for the value 3 (observed $N_3$ times), ... If the variance of this
probability distribution is σ²,

$$\mathrm{Var}\,\lambda^* = \sigma^2/N$$

and, in fact,

$$\mathrm{Var}\,\lambda^* = [\lambda + \lambda^2/(e^{\lambda} - 1)]/N.$$

For the maximum likelihood estimate

$$\mathrm{Var}\,\hat\lambda \sim \frac{\lambda(1 - e^{-\lambda})^2}{N(1 - e^{-\lambda} - \lambda e^{-\lambda})}.$$

This is an asymptotic result, whereas the expression for Var. λ* is
exact. Some numerical values for the two variances are given in the
following table, which also permits an assessment of the loss of information
due to truncation (by comparing the second column with the third).

TABLE 1.
  λ      N Var. λ̂        N Var. λ*     Efficiency
         (asymptotic)    (exact)       of λ*
 0.5       0.8582         0.8854        0.9693
 1.0       1.5122         1.5820        0.9559
 1.5       2.0474         2.1462        0.9539
 2.0       2.5173         2.6261        0.9586
 2.5       2.9555         3.0589        0.9662
 3.0       3.3823         3.4716        0.9743
 3.5       3.8095         3.8814        0.9815
 4.0       4.2434         4.2985        0.9872

The efficiency of λ* is readily computed from the formula

$$(\mathrm{Var}\,\hat\lambda)/(\mathrm{Var}\,\lambda^*) = (1 - e^{-\lambda}) \Big/ \left\{1 - \frac{\lambda^2}{(e^{\lambda} - 1)^2}\right\};$$

it tends to 1 as λ tends to zero or infinity, and never falls below 0.9536,
the minimum value being attained when λ = 1.355. In view of the
ease with which λ* can be calculated, there are good reasons for using it.
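
The entries of Table 1 can be reproduced from the two variance expressions. The short Python sketch below does this; the function names are hypothetical and the code is only a numerical check of the formulae quoted above.

```python
import math

def n_var_ml(lam):
    """N times the asymptotic variance of the maximum likelihood estimate."""
    q = math.exp(-lam)
    return lam * (1.0 - q) ** 2 / (1.0 - q - lam * q)

def n_var_star(lam):
    """N times the exact variance of the simple estimate lambda*."""
    return lam + lam ** 2 / math.expm1(lam)

def efficiency(lam):
    """Ratio of the two variances, i.e. the efficiency of lambda*."""
    return n_var_ml(lam) / n_var_star(lam)

for lam in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0):
    print(f"{lam:3.1f}  {n_var_ml(lam):.4f}  {n_var_star(lam):.4f}  {efficiency(lam):.4f}")
```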
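5. Second, consider the estimation of Var. λ*. If

$$\sum_{r=1}^{\infty} y_r\lambda^r/r! = N(e^{\lambda} - 1)\,\mathrm{Var}\,\lambda^* = \lambda(e^{\lambda} + \lambda - 1),$$

then $y_1 = 0$, $y_2 = 4$, $y_r = r$ $(r \ge 3)$.
An unbiased estimate of Var. λ* is therefore

$$(\mathrm{Var}\,\lambda^*)^* = \Big(\sum_{r=2}^{\infty} rN_r + 2N_2\Big)\Big/N^2.$$

In the example given above,

$$(\mathrm{Var}\,\lambda^*)^* = (747 + 624)/2390^2 = 0.0002400,$$

corresponding to a standard error of 0.0155. This estimate of the
variance of λ* may be compared with the value 0.0002422 obtained by
inserting λ = λ* in the exact formula. Proceeding as before,

$$\mathrm{Var}\,(\mathrm{Var}\,\lambda^*)^* = [\lambda + (7\lambda^2 - 2\lambda^3)/(e^{\lambda} - 1) - \lambda^4/(e^{\lambda} - 1)^2]/N^3,$$

whereas the variance of the quantity obtained by inserting λ* for λ
in the expression for Var. λ* is

$$\mathrm{Var}\,(\mathrm{Var}^*\lambda^*) \sim [1 + 2\lambda/(e^{\lambda} - 1) - \lambda^2e^{\lambda}/(e^{\lambda} - 1)^2]^2\,\mathrm{Var}\,\lambda^*/N^2.$$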
Both variances are asymptotically equal to 8λ/N³ as λ tends to zero,
and to λ/N³ as λ tends to infinity. Some other values are given below,
and there appears to be some justification for using (Var. λ*)* provided
that λ is below 0.5 or above 4.0. Since the omission of the zero group
may reasonably arise from the small value of λ, a rapid estimate of
Var. λ* is worth mentioning.


TABLE 2.

  λ      N³ Var. (Var.* λ*)    N³ Var. (Var. λ*)*    Ratio
         (asymptotic)          (exact)

 0.5        2.1604                2.6637             .8110
 1.0        2.4453                3.5712             .6847
 1.5        2.2761                3.6673             .6206
 2.0        2.1366                3.4862             .6129
 2.5        2.1493                3.3054             .6502
 3.0        2.3235                3.2492             .7151
 3.5        2.6396                3.3545             .7869
 4.0        3.0484                3.6033             .8460
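
The two columns of Table 2 follow directly from the expressions for the variances of the two estimates of Var. λ*. A minimal Python check is sketched below; the function names are hypothetical.

```python
import math

def n3_var_exact(lam):
    """N^3 times the exact variance of the unbiased estimate (Var lambda*)*."""
    em = math.expm1(lam)  # e^lambda - 1
    return lam + (7 * lam ** 2 - 2 * lam ** 3) / em - lam ** 4 / em ** 2

def n3_var_asymptotic(lam):
    """N^3 times the asymptotic variance obtained by inserting lambda*
    for lambda in the exact formula for Var lambda*."""
    em = math.expm1(lam)
    slope = 1 + 2 * lam / em - lam ** 2 * math.exp(lam) / em ** 2
    n_var_star = lam + lam ** 2 / em
    return slope ** 2 * n_var_star

for lam in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0):
    a, e = n3_var_asymptotic(lam), n3_var_exact(lam)
    print(f"{lam:3.1f}  {a:.4f}  {e:.4f}  {a / e:.4f}")
```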


6. There is, in general, no reason to suppose that λ is the quantity
to which special interest attaches, and the same method can be used
to estimate any function of λ and its sampling variance, provided that
these quantities can be expanded in power series. An investigation
of the efficiency of the resulting estimate should always be made,
however, for estimates as efficient as λ* may be the exception rather
than the rule. Thus, an unbiased estimate of $e^{-\lambda}$ is found to be

$$(e^{-\lambda})^* = [(N_1 + N_3 + N_5 + \cdots) - (N_2 + N_4 + N_6 + \cdots)]/N,$$

but its efficiency, compared with $e^{-\hat\lambda}$, is as low as 0.4994 when λ = 0.5
and rapidly decreases to zero with increasing λ.
TABLES OF PEARSON-LEE-FISHER FUNCTIONS OF SINGLY
TRUNCATED NORMAL DISTRIBUTIONS*


A. C. COHEN, JR. AND JOHN WOODWARD


The University of Georgia 


Singly truncated normal distributions are encountered in numerous 
scientific investigations and with such frequency that it seems highly 
desirable to provide tables or other computational aids for lessening the 
labor involved in computing estimates of population parameters. In a 
distribution of the type considered here, the point of truncation, $x_0'$,
is assumed known and measurements are possible only for $x' > x_0'$.
No record is available either of count or measurement for $x' < x_0'$.
Although Pearson-Lee-Fisher estimators [1, 2, 3] for such distributions 
were first given by Pearson and Lee in 1908, tables adequate for routine 
estimation without considerable computational effort have not previously 
been available. One of the present authors [4] derived equations in 
1949 which permit calculation of P.L.F. estimates by a simple iterative 
process with the aid of ordinary tables of normal curve areas and ordi- 
nates. Thereby dependence on special tables was eliminated, but the 
problem of effectively reducing the labor incident to routine analyses 
of large numbers of samples from these distributions remained. Using 
the equations mentioned above, the present tables were prepared in 
an effort to alleviate this latter difficulty. 

The frequency function of a distribution of the type under considera- 
tion may be written as 


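$$f(x') = \frac{\exp\left\{-\tfrac{1}{2}\left[(x' - m)/\sigma\right]^2\right\}}{\sigma\int_{\xi}^{\infty} e^{-t^2/2}\,dt}, \qquad x' \ge x_0',$$


where m and σ are respectively the population mean and standard
deviation, ξ is the truncation point in standard units of the complete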


*Sponsored by the Office of Ordnance Research, U. S. Army, under contract DA-01-009-ORD-288. 
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TABLE I: THE FUNCTIONS 1/(Z — £) AND m2/2m?2 
3 1/(Z — m2 /2m,? 1/(Z — m2/2m,? 
—4.0 | 0.2499 9164 | 0.5312 3118 —2.25 | 0.4381 8666 | 0.5889 6377 
—3.9 .2563 9720 | .5328 4429 —2.24 | .4399 7185 | .5895 5609 
—3.8 .2631 3768 | .5345 8231 —2.23 .4417 6893 .5901 5225 
—3.7 .2702 3924 .5364 5722 —2.22 | .4435 7797 .5907 5225 
—3.6 .2777 3056 | .5384 8215 —2.21 .4453 9905 .5913 5610 
—3.5 | 0.2856 4305 | 0.5406 7131 —2.20 | 0.4472 3224 | 0.5919 6380 
—3.4 .2940 1106 | .5430 4005 —2.19 .4490 7762 | .5925 7535 
—3.3 .3028 7213 .5456 0478 —2.18 | .4509 3528 | .5931 9077 
—3.2 .3122 6719 .5483 8291 —2.17 .4528 0528 | .5938 1004 
−3.1 | .3222 4074 | .5513 9269   −2.16 | .4546 8771 | .5944 3318
—3.0 | 0.3328 4097 | 0.5546 5301 —2.15 | 0.4565 8266 | 0.5950 6023 
—2.9 .3441 1993 .5581 8315 —2.14| .4584 9014 | .5956 9105 
−2.8 | .3561 3351 | .5620 0245   −2.13 | .4604 1030 | .5963 2579
—2.7 .3689 4145 | .5661 2985 —2.12 | .4623 4820 | .5969 6440 
−2.6 | .3826 0720 | .5705 8349   −2.11 | .4642 8890 | .5976 0689
—2.50 | 0.3971 9772 | 0.5753 8016 —2.10 | 0.4662 4750 | 0.5982 5324 
—2.49 | .3987 1031 .5758 7929 —2.09 | .4682 1906 | .5989 0346 
−2.48 | .4002 3291 | .5763 8201   −2.08 | .4702 0366 | .5995 5755
—2.47 .4017 6561 .5768 8833 —2.07 .4722 0139 .6002 1551 
—2.46 | .4033 0847 | .5773 9828 —2.06 | .4742 1231 .6008 7733 
—2.45 | 0.4048 6157 | 0.5779 1186 —2.05 | 0.4762 3651 | 0.6015 4302 
−2.44 | .4064 2497 | .5784 2909   −2.04 | .4782 7405 | .6022 1257
−2.43 | .4079 9876 | .5789 4998   −2.03 | .4803 2503 | .6028 8598
−2.42 | .4095 8299 | .5794 7454   −2.02 | .4823 8952 | .6035 6324
−2.41 | .4111 7776 | .5800 0278   −2.01 | .4844 6759 | .6042 4435
—2.40 | 0.4127 8313 | 0.5805 3471 —2.00 | 0.4865 5932 | 0.6049 2930 
−2.39 | .4143 9917 | .5810 7035   −1.99 | .4886 6479 | .6056 1810
−2.38 | .4160 2597 | .5816 0970   −1.98 | .4907 8407 | .6063 1073
−2.37 | .4176 6359 | .5821 5278   −1.97 | .4929 1724 | .6070 0719
−2.36 | .4193 1211 | .5826 9960   −1.96 | .4950 6438 | .6077 0746
−2.35 | 0.4209 7160 | 0.5832 5017   −1.95 | 0.4972 2557 | 0.6084 1156
−2.34 | .4226 4214 | .5838 0449   −1.94 | .4994 0087 | .6091 1946
—2.33 .4243 2380 | .5843 6257 —1.93 .5015 9020 | .6098 3091 
−2.32 | .4260 1666 | .5849 2443   −1.92 | .5037 9414 | .6105 1164
−2.31 | .4277 2080 | .5854 9007   −1.91 | .5060 1226 | .6112 6591
—2.30 | 0.4294 3629 | 0.5860 5950 —1.90 | 0.5082 4480 | 0.6119 8895 
—2.29 | .4311 6321 .5866 3273 —1.89 | .5104 9184 | .6127 1575 
—2.28 | .4329 0163 | .5872 0976 —1.88 | .5127 5346 | .6134 4630 
—2.27 | .4346 5162 | .5877 9061 —1.87 | .5150 2972 | .6141 8059 
—2.26 | .4364 1328 | .5883 7528 —1.86 | .5173 2071 .6149 1861 
TABLE I: THE FUNCTIONS $1/(Z-\xi)$ AND $m_2/2m_1^2$ (Continued)
$\xi$ | $1/(Z-\xi)$ | $m_2/2m_1^2$   $\xi$ | $1/(Z-\xi)$ | $m_2/2m_1^2$
—1.85 | 0.5196 2649 | 0.6156 6035 —1.45 | 0.6248 0613 | 0.6481 7579 
—1.84 .5219 4714 .6164 0578 —1.44 .6277 7842 | .6490 5333 
—1.83 .5242 8275 .6171 5491 —1.43 .6307 0521 .6498 4875 
−1.82 | .5266 3337 | .6179 0771   −1.42 | .6337 7578 | .6508 1667
−1.81 | .5289 9908 | .6186 6418   −1.41 | .6368 0090 | .6517 0233
—1.80 | 0.5313 7996 | 0.6194 2429 —1.40 | 0.6398 4390 | 0.6525 9083 
—1.79 | .5337 7607 .6201 8803 —1.39 | .6429 0912 | .6534 8791 
—1.78 | .5361 8550 | .6209 5540 —1.38 | .6459 8319 | .6543 7554 
—1.77 | .5386 1431 .6217 2636 —1.37 .6490 7966 | .6552 7177 
—1.76 .5410 5658 | .6225 0090 —1.36 | .6521 9408 | .6561 7053 
—1.75 | 0.5435 1437 | 0.6232 7901 —1.35 | 0.6553 2650 | 0.6570 7178 
—1.74 .5459 8776 | .6240 6067 —1.34 | .6584 7696 | .6579 7552 
—1.73 | .5484 7682 | .6248 4581 —1.33 .6616 4553 .6588 8168 
—1.72 | .5509 8162 | .6256 3457 —1.32 | .6648 3225 | .6597 8524 
—1.71 .5535 0223 | .6264 2677 —1.31 .6680 3716 | .6607 0116 
—1.70 | 0.5560 3872 | 0.6272 2244 —1.30 | 0.6712 6031 | 0.6616 1440 
—1.69 .5585 9115 .6280 2156 —1.29 | .6745 0175 | .6625 2993 
—1.68 | .5611 5960 | .6288 2411 —1.28 | .6777 6152 | .6634 4771 
—1.67 | .5637 4414 | .6296 3008 —1.27 | .6810 3899 | .6643 6681 
—1.66 | .5663 4483 | .6304 3944 —1.26 | .6843 3624 | .6652 8477 
—1.65 | 0.5689 6175 | 0.6312 5218 —1.25 | 0.6876 5128 | 0.6662 1419 
—1.64 | .5715 9494 | .6320 6823 —1.24 | .6909 8483 .6671 4061 
—1.63 .5742 4449 .6328 8763 —1.23 | .6943 3692 | .6680 6397 
—1.62 | .5769 1046 .6337 1031 —1.22 | .6977 0760 | .6689 9959 
—1.61 .5795 9292 | .6345 3628 —1.21 .7010 9692 | .6699 3208 
—1.60 | 0.5822 9194 | 0.6353 6550 —1.20 | 0.7045 0490 | 0.6708 6652 
—1.59 | .5850 0757 .6361 9784 —1.19 | .7079 3159 | .6718 0286 
—1.58 | .5877 4333 | .6370 3834 —1.18 | .7113 7703 .6727 4108 
—1.57 | .5904 8893 | .6378 7240 —1.17 | .7148 4125 | .6736 8113 
—1.56 | .5932 5479 | .6387 1436 —1.16 | .7183 2428 | .6746 2297 
—1.55 | 0.5960 3753 | 0.6395 5945 —1.15 | 0.7218 2618 | 0.6755 6656 
−1.54 | .5988 3719 | .6404 0763   −1.14 | .7253 4667 | .6765 1150
−1.53 | .6016 5385 | .6412 5887   −1.13 | .7288 8666 | .6774 5885
−1.52 | .6044 8757 | .6421 1316   −1.12 | .7324 4532 | .6784 0746
−1.51 | .6073 3840 | .6429 7045   −1.11 | .7360 2297 | .6793 5765
—1.50 | 0.6102 0640 | 0.6438 3073 —1.10 | 0.7396 1964 | 0.6803 0941 
—1.49 | .6130 9165 | .6446 9396 —1.09 | .7432 3536 | .6812 6267 
—1.48 | .6159 9418 | .6455 6011 —1.08 | .7468 7016 | .6822 1740 
—1.47 | .6189 1407 | .6464 2915 —1.07 | .7505 2406 | .6831 7356 
—1.46 | .6218 5138 | .6473 0106 —1.06 | .7541 9711 .6841 3110 
TABLE I: THE FUNCTIONS $1/(Z-\xi)$ AND $m_2/2m_1^2$ (Continued)
$\xi$ | $1/(Z-\xi)$ | $m_2/2m_1^2$   $\xi$ | $1/(Z-\xi)$ | $m_2/2m_1^2$
—1.05 | 0.7578 8932 | 0.6850 9000 —0.65 | 0.9215 0402 | 0.7240 7363 
−1.04 | .7616 0140 | .6860 5107   −0.64 | .9259 9527 | .7250 5211
−1.03 | .7653 3133 | .6870 1166   −0.63 | .9305 0610 | .7260 3022
—1.02 .7690 8119 .6879 7434 —0.62 .9350 3648 .7270 0792 
—1.01 | .7728 5031 | .6889 3821 —0.61 | .9395 8641 | .7279 8517 
—1.00 | 0.7766 3873 | 0.6899 0322 —0.60 | 0.9441 5588 | 0.7289 6193 
−0.99 | .7804 4645 | .6908 6932   −0.59 | .9487 4489 | .7299 3817
−0.98 | .7842 7351 | .6918 3648   −0.58 | .9533 5341 | .7309 1385
−0.97 | .7881 1992 | .6928 0466   −0.57 | .9579 8144 | .7318 8892
−0.96 | .7919 8570 | .6937 7381   −0.56 | .9626 2896 | .7328 6336
—0.95 | 0.7958 6454 | 0.6947 3584 —0.55 | 0.9672 9606 | 0.7338 3725 
—0.94 .7997 7546 .6957 1486 —0.54 .9719 8244 .7348 1019 
−0.93 | .8036 9947 | .6966 8668   −0.53 | .9766 8837 | .7357 8250
−0.92 | .8076 4293 | .6976 5930   −0.52 | .9814 1373 | .7367 5403
−0.91 | .8116 0585 | .6986 3269   −0.51 | .9861 5852 | .7377 2474
−0.90 | 0.8155 8824 | 0.6996 0680   −0.50 | 0.9909 2272 | 0.7386 9460
−0.89 | .8195 9013 | .7005 8159   −0.49 | 0.9957 0531 | .7396 6233
—0.88 .8236 1151 .7015 5703 —0.48 | 1.0005 0925 .7406 3160 
—0.87 .8276 5241 .7025 3306 —0.47 | 1.0053 3156 .7415 9869 
—0.86 .8317 1284 .7035 0964 —0.46 | 1.0101 7320 .7425 6478 
—0.85 | 0.8357 9280 | 0.7044 8674 —0.45 | 1.0150 3414 | 0.7435 2983 
−0.84 | .8398 9231 | .7054 6832   −0.44 | 1.0199 1488 | .7444 9383
—0.83 | .8440 1138 | .7064 4232 —0.43 | 1.0248 1889 | .7454 5674 
—0.82 | .8481 4675 | .7074 1663 —0.42 | 1.0297 3264 | .7464 1851 
—0.81 .8523 0821 .7083 9946 —0.41 | 1.0346 7062 .7473 7912 
—0.80 | 0.8564 8599 | 0.7093 7852 —0.40 | 1.0396 2780 | 0.7483 3854 
—0.79 .8606 8335 .7103 5784 —0.39 | 1.0446 2089 .7493 1748 
—0.78 .8649 0030 .7113 3738 —0.38 | 1.0495 9966 .7502 5366 
—0.77 | .8691 3686 | .7123 1708 —0.37 | 1.0546 1429 | .7512 0929 
—0.76 | .8733 9299 .7132 9699 —0.36 | 1.0596 4802 | .7521 6361 
—0.75 | 0.8776 5898 | 0.7142 6476 —0.35 | 1.0647 2041 | 0.7531 4085 
—0.74 | .8819 6407 | .7152 5701 —0.34 | 1.0697 7268 | .7540 6815 
—0.73 .8862 7901 .7162 3708 —0.33 | 1.0748 6355 .7550 1831 
—0.72 .8906 1355 .7172 1713 —0.32 | 1.0799 7340 .7559 6702 
−0.71 | .8949 6769 | .7181 9712   −0.31 | 1.0851 0222 | .7569 1426
−0.70 | 0.8993 4143 | 0.7191 7701   −0.30 | 1.0902 4996 | 0.7578 5998
−0.69 | .9037 3477 | .7201 5676   −0.29 | 1.0954 1661 | .7588 0418
—0.68 | .9081 4770 | .7211 3635 —0.28 | 1.1006 0212 | .7597 4660 
—0.67 | .9125 8023 | .7221 1571 —0.27 | 1.1058 0647 | .7606 8785 
—0.66 .9170 3233 .7230 9481 —0.26 | 1.1110 2962 .7616 2726 
TABLE I: THE FUNCTIONS $1/(Z-\xi)$ AND $m_2/2m_1^2$ (Continued)
$\xi$ | $1/(Z-\xi)$ | $m_2/2m_1^2$   $\xi$ | $1/(Z-\xi)$ | $m_2/2m_1^2$
—0.25 | 1.1162 7155 | 0.7625 6503 0.16 | 1.3468 8124 | 0.7992 9404 
−0.24 | 1.1215 2100 | .7634 8720   0.17 | 1.3528 7561 | .8001 4178
—0.23 | 1.1268 1158 .7644 3550 0.18 | 1.3588 8690 .8009 8698 
—0.22 | 1.1321 0962 .7653 6815 0.19 | 1.3649 1507 .8018 2964 
—0.21 | 1.1374 2629 .7662 9904 0.20 | 1.3709 6007 .8026 6975 
—0.20 | 1.1427 6157 | 0.7672 2816 0.21 | 1.3770 2184 | 0.8035 0728 
—0.19 | 1.1481 1541 .7681 5546 0.22 | 1.3831 0034 .8043 4224 
—0.18 | 1.1534 8776 .7690 8090 0.23 | 1.3891 9551 .8051 7460 
−0.17 | 1.1588 7863 | .7700 0452   0.24 | 1.3953 0731 | .8060 0436
—0.16 | 1.1642 8794 .7709 2624 0.25 | 1.4014 3567 .8068 3151 
—0.15 | 1.1697 1567 | 0.7718 4605 0.26 | 1.4075 8055 | 0.8076 5603 
—0.14 | 1.1751 6177 .7727 6392 0.27 | 1.4137 4189 .8084 7792 
—0.13 | 1.1806 2621 .7736 7983 0.28 | 1.4199 1965 .8092 9715 
—0.12 | 1.1861 0895 .7745 9376 0.29 | 1.4261 1376 .8101 1374 
−0.11 | 1.1916 0995 | .7755 0568   0.30 | 1.4323 2418 | .8109 2765
—0.10 | 1.1971 2917 | 0.7764 1558 0.31 | 1.4385 5085 | 0.8117 3889 
—0.09 | 1.2026 6656 .7773 2342 0.32 | 1.4447 9372 .8125 4745 
—0.08 | 1.2082 2209 .7782 2919 0.33 | 1.4510 5273 .8133 5331 
—0.07 | 1.2137 9571 .7791 3286 0.34 | 1.4573 2783 .8141 5648 
−0.06 | 1.2193 8739 | .7800 3443   0.35 | 1.4636 1897 | .8149 5693
—0.05 | 1.2249 8601 | 0.7809 2001 0.36 | 1.4699 2609 | 0.8157 5466 
—0.04 | 1.2306 2473 .7818 3111 0.37 | 1.4762 4914 .8165 4967 
—0.03 | 1.2362 7486 .7827 3189 0.38 | 1.4825 8806 .8173 4195 
−0.02 | 1.2419 3376 | .7836 1907   0.39 | 1.4889 4280 | .8181 3149
—0.01 | 1.2476 1505 .7845 0973 0.40 | 1.4953 1330 .8189 1828 

0.00 | 1.2533 1414 | 0.7853 9816   0.41 | 1.5016 9951 | 0.8197 0231
0.01 | 1.2590 5667 | 0.7863 1656   0.42 | 1.5081 0138 | .8204 8359
0.02 | 1.2647 6550 | .7871 6822   0.43 | 1.5145 1884 | .8212 6211
0.03 | 1.2705 1768 | .7880 4982   0.44 | 1.5209 5184 | .8220 3785
0.04 | 1.2762 8747 | .7889 2911   0.45 | 1.5274 0033 | .8228 1081
0.05 | 1.2820 8904 | .7898 2392   0.46 | 1.5339 3484 | 0.8236 7304
0.06 | 1.2878 7970 | 0.7906 8067   0.47 | 1.5403 4356 | .8243 4840
0.07 | 1.2937 0204 | .7915 5291   0.48 | 1.5468 3817 | .8251 1301
0.08 | 1.2995 4180 | .7924 2377   0.49 | 1.5533 4806 | .8258 7482
0.09 | 1.3053 9893 | .7932 9024   0.50 | 1.5598 7315 | .8266 3383
0.10 | 1.3112 7339 | .7941 5529   0.6 | 1.6259 4816 | 0.8340 6925
0.11 | 1.3171 6513 | 0.7950 1791   0.7 | 1.6934 8199 | .8412 2194
0.12 | 1.3230 7409 | .7958 7808   0.8 | 1.7624 1805 | .8480 9147
0.13 | 1.3290 0024 | .7967 3580   0.9 | 1.8326 9980 | .8546 7938
0.14 | 1.3349 4351 | .7975 9104   1.0 | 1.9042 7123 | .8609 889
0.15 | 1.3409 0386 | .7984 4379
TABLE I: THE FUNCTIONS $1/(Z-\xi)$ AND $m_2/2m_1^2$ (Continued)
$\xi$ | $1/(Z-\xi)$ | $m_2/2m_1^2$   $\xi$ | $1/(Z-\xi)$ | $m_2/2m_1^2$
1.1 | 1.9770 7708 | 0.8670 245   2.1 | 2.7618 362 | 0.9139 42
1.2 | 2.0510 6317 | .8727 922   2.2 | 2.8449 813 | .9174 80
1.3 | 2.1261 7654 | .8782 986   2.3 | 2.9288 166 | .9208 44
1.4 | 2.2023 6569 | .8835 513   2.4 | 3.0133 080 | .9240 43
1.5 | 2.2795 8069 | .8885 585   2.5 | 3.0984 233 | .9270 84
1.6 | 2.3577 7330 | 0.8933 288   2.6 | 3.1841 318 | 0.9299 77
1.7 | 2.4368 9703 | .8978 711   2.7 | 3.2704 044 | .9327 27
1.8 | 2.5169 0714 | .9021 943   2.8 | 3.3572 134 | .9353 42
1.9 | 2.5977 6079 | .9063 078   2.9 | 3.4445 327 | .9378 30
2.0 | 2.6794 1689 | .9102 21   3.0 | 3.5323 375 | .9401 99




distribution, or more precisely $\xi = (x_0' - m)/\sigma$, and $I_0(\xi)$ is the fraction
of the complete distribution retained after truncation; i.e. $I_0(\xi) = \int_\xi^\infty \varphi(t)\,dt$,
where $\varphi(t) = (1/\sqrt{2\pi}) \exp(-t^2/2)$. The P.L.F. estimating
equations are written as


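$$\frac{1}{2}\left[\frac{1}{Z-\hat\xi}\right]\left[\frac{1}{Z-\hat\xi} - \hat\xi\right] = \frac{n\sum x^2}{2\left(\sum x\right)^2}, \tag{2}$$

$$\hat\sigma = \frac{\sum x}{n}\cdot\frac{1}{Z-\hat\xi}, \tag{3}$$

$$\hat m = x_0' - \hat\sigma\hat\xi, \tag{4}$$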


where $\sum x$ and $\sum x^2$ are respectively sums of the first and second powers
of sample measurements about the truncation point ($x = x' - x_0'$)
and n is the number of sample observations $> x_0'$. Z is a function of $\xi$
defined as $Z(\xi) = \varphi(\xi)/I_0(\xi)$. The estimates obtained on solving these
equations are maximum likelihood as well as moment estimates and
accordingly are distinguished from the parameters estimated by the
symbol (^). The notation employed here is essentially that of [4], and
it differs in some respects from the notations of [1, 2, 3].

In practice, $n\sum x^2/2(\sum x)^2$ is calculated from the sample data and
equation (2) is solved for $\hat\xi$, which on substitution into (3) yields $\hat\sigma$.
Equation (4) is then employed to determine $\hat m$.

To facilitate solution of equations (2) and (3), the accompanying 
tables of $1/(Z-\xi)$ and of $m_2/2m_1^2$, where

$$\frac{m_2}{2m_1^2} = \frac{1 - \xi(Z-\xi)}{2(Z-\xi)^2},$$
TABLE II: THE FUNCTIONS $W'(\xi)$, $w'(\xi)$ AND $\rho$

$\xi$ | $W'$ | $w'$ | $\rho$
−3.0 | 0.536 283 | 5.986 069 | 0.9114
−2.9 | 0.545 333 | 5.786 631 | 0.9078
−2.8 | 0.556 167 | 5.610 046 | 0.9043
−2.7 | 0.569 038 | 5.457 103 | 0.9008
−2.6 | 0.583 808 | 5.323 872 | 0.8974
−2.5 | 0.602 029 | 5.225 319 | 0.8943
−2.4 | 0.6… | … | …
−2.3 | … | … | …
−2.2 | … | … | …
−2.1 | 0.706 637 | 5.084 354 | 0.8845
−2.0 | 0.743 283 | 5.123 602 | 0.8831
−1.9 | 0.785 157 | 5.196 188 | 0.8821
−1.8 | 0.832 880 | 5.304 564 | 0.8816
−1.7 | 0.887 141 | 5.451 691 | 0.8815
−1.6 | 0.948 713 | 5.641 149 | 0.8820
−1.5 | 1.018 458 | 5.877 237 | 0.8830
−1.4 | 1.097 337 | 6.165 097 | 0.8845
−1.3 | 1.186 420 | 6.510 851 | 0.8865
−1.2 | 1.286 897 | 6.921 577 | 0.8889
−1.1 | 1.400 090 | 7.406 409 | 0.8917
−1.0 | 1.527 464 | 7.974 909 | 0.8949
−0.9 | 1.670 639 | 8.639 143 | 0.8984
−0.8 | 1.831 403 | 9.413 034 | 0.9022
−0.7 | 2.011 724 | 10.… | …
−0.6 | 2.213 765 | 11.… | …
−0.5 | 2.439 898 | 12.569 370 | 0.9145
−0.4 | 2.692 714 | 13.973 826 | 0.9188
−0.3 | 2.975 044 | 15.600 726 | 0.9232
−0.2 | 3.289 968 | 17.484 486 | 0.9275
−0.1 | 3.640 849 | 19.664 939 | 0.9316
0.0 | 4.031 257 | 22.187 540 | 0.9359
0.1 | 4.465 167 | 25.105 193 | 0.9400
0.2 | 4.946 791 | 28.478 115 | 0.9439
0.3 | 5.480 680 | 32.375 335 | 0.9477
0.4 | 6.071 728 | 36.875 669 | 0.9513
0.5 | 6.725 176 | 42.068 884 | 0.9547
0.6 | 7.446 632 | 48.056 996 | 0.9579
0.7 | 8.242 087 | 54.995 696 | 0.9610
0.8 | 9.117 921 | 62.895 815 | 0.9639
0.9 | 10.080 921 | 72.025 044 | 0.9666
1.0 | 11.138 290 | 82.509 610 | 0.9691
1.1 | 12.297 666 | 94.536 317 | 0.9714
1.2 | 13.567 116 | 108.314 380 | 0.9736
1.3 | 14.955 170 | 124.077 852 | 0.9756
1.4 | 16.470 818 | 142.087 792 | 0.9775
1.5 | 18.123 532 | 162.634 894 | 0.9792
1.6 | 19.923 245 | 186.041 823 | 0.9808
1.7 | 21.880 362 | 212.665 989 | 0.9823
1.8 | 24.005 980 | 242.904 606 | 0.9837
1.9 | 26.311 135 | 277.190 113 | 0.9849
2.0 | 28.808 163 | 316.007 189 | 0.9861




The assistance of Mr. Walter Lynch in performing many of the computations in- 
volved in preparing these tables is gratefully acknowledged. 


were compiled as functions of $\xi$ using a standard interval of 0.01. For
values of $\xi$ less than −2.5 and for those greater than 0.5, the interval
is 0.1. Most entries are given to 8 decimals. The National Bureau of 
Standards W.P.A. Tables of Normal Curve Areas and Ordinates [5] 
served as a basis for the calculations. 

Weighting factors $W'$ and $w'$, obtained from the variance-covariance
matrix for use in determining variances of $\hat\sigma$ and of $\hat\xi$, were computed


at intervals of 0.1 on $\xi$. Formulas used in these calculations are the
following, which are given in [4].


$$W' = \frac{1 - Z(Z-\xi)}{[1 - Z(Z-\xi)][2 - \xi(Z-\xi)] - (Z-\xi)^2}, \qquad
w' = \frac{2 - \xi(Z-\xi)}{[1 - Z(Z-\xi)][2 - \xi(Z-\xi)] - (Z-\xi)^2}.$$


To estimate the required variances, one has only to read $W'$ and $w'$
corresponding to $\xi = \hat\xi$ from Table II and evaluate


$$V(\hat\sigma) = \frac{\sigma^2}{n}\,W' \qquad\text{and}\qquad V(\hat\xi) = \frac{w'}{n}.$$


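The coefficient of correlation $\rho_{\hat\sigma,\hat\xi}$ between sampling errors of $\hat\sigma$ and $\hat\xi$,
also obtained from the variance-covariance matrix, has been tabulated
to 4 decimals using the relation (cf. [4])


$$\rho_{\hat\sigma,\hat\xi} = \frac{Z-\xi}{\sqrt{[1 - Z(Z-\xi)][2 - \xi(Z-\xi)]}}.$$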
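
The weighting factors and the correlation coefficient can be checked numerically. The sketch below is not part of the original paper: it assumes $W' = a/D$, $w' = b/D$ and $\rho = (Z-\xi)/\sqrt{ab}$ with $a = 1 - Z(Z-\xi)$, $b = 2 - \xi(Z-\xi)$ and $D = ab - (Z-\xi)^2$, and the function names `z_ratio` and `table2_entries` are hypothetical. At $\xi = 0$ it should agree with the Table II row 4.031 257, 22.187 540, 0.9359.

```python
import math


def z_ratio(xi: float) -> float:
    """Z(xi) = phi(xi) / I0(xi), where I0 is the normal area to the right of xi."""
    phi = math.exp(-0.5 * xi * xi) / math.sqrt(2.0 * math.pi)
    i0 = 0.5 * math.erfc(xi / math.sqrt(2.0))
    return phi / i0


def table2_entries(xi: float):
    """Return (W', w', rho) at truncation point xi (assumed formulas, see lead-in)."""
    z = z_ratio(xi)
    d = z - xi               # Z - xi
    a = 1.0 - z * d          # 1 - Z(Z - xi)
    b = 2.0 - xi * d         # 2 - xi(Z - xi)
    det = a * b - d * d      # common denominator of W' and w'
    return a / det, b / det, d / math.sqrt(a * b)


W, w, rho = table2_entries(0.0)
print(f"W' = {W:.6f}, w' = {w:.6f}, rho = {rho:.4f}")
```

Evaluating `table2_entries` at other tabulated arguments gives a quick way to spot transcription errors in a copy of Table II.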


Designating the argument as 7 rather than ¢, Sampford [6] recently 
published tables of $Z$, $Z(Z-\xi)$, and $Z - \xi Z(Z-\xi)$ which in his
notation are designated v, \ and ¢ respectively. Although useful for 
the purposes intended, Sampford’s tables fail to meet the need which 
prompted preparation of the present tables. The most useful tables 
previously available for the present purposes are those of Hald [7]. 
He gives $\xi$ as a function of $m_2/2m_1^2$ at intervals of 0.001 but only to
three decimals. He also tabulated $1/(Z-\xi)$ at intervals of 0.1 and
gives elements of the variance-covariance matrix and the correlation
coefficient $\rho$ at the same interval. He does not, however, give the
variance weighting factors $W'$ and $w'$ explicitly. Hald's notation is
different from that employed by Sampford as well as from that used here.
He designates $m_2/2m_1^2$ as $y$, $\xi$ as $z$, and $1/(Z-\xi)$ as $g(z)$. It is also
noted that Hald's correlation coefficient relates to sampling errors
between $\hat\sigma$ and $\hat m$, whereas the coefficient tabulated here relates to
similar errors between $\hat\sigma$ and $\hat\xi$.




An Illustrative Example. To demonstrate the practical use of these 
tables, it is convenient to choose an example previously employed to 
illustrate the iterative procedure of reference [4]. For this example, 
$n = 37$, $\sum x = 51.8600$, $\sum x^2 = 98.0156$, and $x_0' = 0.850$. Accordingly,
we compute $n\sum x^2/2(\sum x)^2 = 0.67422043$. Entering Table I with
$m_2/2m_1^2 = 0.67422043$, we immediately read $\hat\xi = -1.16$, which is
correct to the two decimals given. For greater accuracy, we interpolate
linearly as summarized:


$\hat\xi$ | $m_2/2m_1^2$
−1.1600 | 0.67462297
−1.1643 | 0.67422043
−1.1700 | 0.67368113


Thereby, after very little effort we obtain $\hat\xi = -1.1643$, which is in
exact agreement with the more laboriously obtained result of [4]. Direct
interpolation gives $1/(Z-\hat\xi) = 0.7168266$, and from equation (3),
$\hat\sigma = (51.8600/37)(0.7168266) = 1.0047$. From (4), $\hat m = 0.850 -
(1.0047)(-1.1643) = 2.020$, and thus all estimates calculated here
agree with the values obtained in [4]. To estimate variances of these
estimates, we interpolate in Table II to read $W'(-1.1643) = 1.337307$,
and $w'(-1.1643) = 7.094662$. Using these values and substituting
the estimate $\hat\sigma = 1.0047$ for the unknown $\sigma$, the variances become
$V(\hat\sigma) = [(1.0047)^2/37][1.337307] = 0.0365$, and $V(\hat\xi) = 7.094662/37 =
0.1917$.
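
The worked example can also be reproduced without table look-up. The following sketch is an illustration, not the authors' procedure: it replaces interpolation in Table I by bisection on the function $m_2/2m_1^2 = [1 - \xi(Z-\xi)]/2(Z-\xi)^2$, which is assumed to increase with $\xi$ over the tabulated range (as the table indicates). The names `moment_ratio` and `plf_estimates` and the bracketing interval are hypothetical.

```python
import math


def z_ratio(xi: float) -> float:
    """Z(xi) = phi(xi) / I0(xi), where I0 is the normal area to the right of xi."""
    phi = math.exp(-0.5 * xi * xi) / math.sqrt(2.0 * math.pi)
    return phi / (0.5 * math.erfc(xi / math.sqrt(2.0)))


def moment_ratio(xi: float) -> float:
    """m2 / (2 m1^2) as a function of xi (second tabulated function of Table I)."""
    d = z_ratio(xi) - xi
    return (1.0 - xi * d) / (2.0 * d * d)


def plf_estimates(n, sum_x, sum_x2, x0, lo=-4.0, hi=3.0):
    """Solve equation (2) for xi by bisection, then apply (3) and (4)."""
    target = n * sum_x2 / (2.0 * sum_x ** 2)
    for _ in range(80):                      # interval shrinks well below 1e-12
        mid = 0.5 * (lo + hi)
        if moment_ratio(mid) < target:       # function assumed increasing in xi
            lo = mid
        else:
            hi = mid
    xi = 0.5 * (lo + hi)
    sigma = (sum_x / n) / (z_ratio(xi) - xi)  # equation (3)
    m = x0 - sigma * xi                       # equation (4)
    return xi, sigma, m


xi_hat, sigma_hat, m_hat = plf_estimates(37, 51.8600, 98.0156, 0.850)
print(f"xi = {xi_hat:.4f}, sigma = {sigma_hat:.4f}, m = {m_hat:.3f}")
```

With the sample sums of the example, the printed values should agree with the interpolated estimates −1.1643, 1.0047 and 2.020 to the figures quoted.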
THE USE OF MITSCHERLICH’S REGRESSION LAW IN
1. Introduction


This paper summarizes, coordinates and extends a series of researches 
carried out by F. Pimentel Gomes, E. Malavolta, W. L. Stevens and 
I. R. Nogueira concerning the application of Mitscherlich’s law 


$$y = A[1 - 10^{-c(x+b)}] \qquad\text{or}\qquad y = \alpha + \beta\rho^{x}$$


to the statistical analysis of experiments with fertilizers. The above 
equation was introduced by Eilh. Alfred Mitscherlich [2] in the study 
of fertilization of soils in Germany. Parameter A measures a maximum 
yield which could not be exceeded by the use of the fertilizer in con- 
sideration. Parameter c measures the efficiency of the fertilizer and 
b measures the soil content of the fertilizer in the control plots in a form 
assimilable by the plant. 

Mitscherlich carried out many experiments in Germany, most of 
them in pots kept in green-houses. The fitting of the curve was ob- 
tained with only two levels of the fertilizer, since a constant known 
value was attributed to c, always the same for each fertilizer. But 
Kletschkowsky and Shelesnow [1] proved that c is not constant, but 
varies with the amounts of other fertilizers. From that time on the 
equation has not been used much, partly because of the apparent 
difficulty of fitting it by satisfactory methods. Several authors have 
applied very crude processes of fitting, some of which have tended to 
bring the use of the curve into disrepute. But in the last few years 
advances have been made in the theory, so that the fitting of the curve 
can now be obtained quickly and accurately in most cases suitable for 
practical application, and efficient estimates may be computed for the 
variances of the estimates of the parameters. 


2. The Estimation of the Parameters 


If in an experiment $x_1, x_2, \cdots, x_n$ are the amounts of a fertilizer
used and $y_1, y_2, \cdots, y_n$ are the yields obtained, then the estimation
of the parameters by the method of least squares should be carried out
taking into account the function


$$w = \sum \{y - A[1 - 10^{-c(x+b)}]\}^2,$$


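of which the derivatives $\partial w/\partial A$, $\partial w/\partial b$, $\partial w/\partial c$ are to be equated to
zero. The following equations are then obtained:

$$\begin{aligned}
\sum y - nA + A\,10^{-cb}\sum 10^{-cx} &= 0,\\
\sum y\,10^{-cx} - A\sum 10^{-cx} + A\,10^{-cb}\sum 10^{-2cx} &= 0,\\
\sum xy\,10^{-cx} - A\sum x\,10^{-cx} + A\,10^{-cb}\sum x\,10^{-2cx} &= 0.
\end{aligned} \tag{2.1}$$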


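Therefore we must have, by Rouché-Capelli's theorem,

$$\begin{vmatrix}
\sum y & n & \sum 10^{-cx}\\
\sum y\,10^{-cx} & \sum 10^{-cx} & \sum 10^{-2cx}\\
\sum xy\,10^{-cx} & \sum x\,10^{-cx} & \sum x\,10^{-2cx}
\end{vmatrix} = 0. \tag{2.2}$$

This equation, obtained by Pimentel-Gomes and Malavolta [4],
depends only on parameter c. Its solution, to get an estimate $\hat c$ of c,
is generally very troublesome. However, we can always fix the amounts
of fertilizer $x_1, x_2, \cdots, x_n$ as multiples of a number q, that is,


$$x_i = m_i q \qquad (i = 1, 2, \cdots, n),$$


where $m_i$ is an integer. Now put $10^{-cq} = z$ and we obtain


$$\begin{vmatrix}
\sum y & n & \sum z^{m}\\
\sum y z^{m} & \sum z^{m} & \sum z^{2m}\\
\sum m y z^{m} & \sum m z^{m} & \sum m z^{2m}
\end{vmatrix} = 0. \tag{2.3}$$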
As c and q are necessarily positive, we must have 0 < z < 1, that is, 
the only root which should be obtained must lie between zero and one. 
Nogueira [3] proved that equation (2.3) has a zero of order three
for $z = 1$ when the levels are supposed to be equally spaced. Therefore
in this case it can be divided by $(z - 1)^3$. This division diminishes
its degree and provides a good check for the computations. The ex- 
pansion of the determinant leads to the equation 


$$P_1(z)\sum y + P_2(z)\sum y z^{m} + P_3(z)\sum m y z^{m} = 0, \tag{2.4}$$


where $P_1(z)$, $P_2(z)$ and $P_3(z)$ are polynomials in z. If we suppose that
p different levels, $x_1, x_2, \cdots, x_p$, were used and if r is the number of
replications (pr = n), then equation (2.4) can be written as follows:


$$\bar y_1[P_1(z) + z^{m_1}P_2(z) + m_1 z^{m_1}P_3(z)] + \bar y_2[P_1(z) + z^{m_2}P_2(z) + m_2 z^{m_2}P_3(z)] + \cdots
+ \bar y_p[P_1(z) + z^{m_p}P_2(z) + m_p z^{m_p}P_3(z)] = 0, \tag{2.5}$$


where $\bar y_1, \bar y_2, \cdots, \bar y_p$ are mean yields of the r replicates corresponding
to each level. The coefficients of $\bar y_1, \bar y_2, \cdots, \bar y_p$ in the last equation
are also polynomials in z, and they can all be divided by $(z - 1)^3$ and
by z. When this division is carried out, the following equation is obtained:


$$\bar y_1 J_{p1}(z) + \bar y_2 J_{p2}(z) + \cdots + \bar y_p J_{pp}(z) = 0. \tag{2.6}$$


The polynomials $J_{pi}(z)$ $(i = 1, 2, \cdots, p)$ are always the same in
comparable sets of experiments. They can be tabulated, therefore, 
for suitable values of p and for z between zero and one. The tables 
constructed by Pimentel-Gomes and Nogueira [7] for p = 5 are re- 
produced below, as well as new tables for the case of p = 4. This 
covers most of the cases to be met in practice, since for the case of 
p = 3 no tables are necessary and with less than three levels the fitting 
is not possible. 

When the tables are available, the solution of equation (2.6) be- 
comes easy. After obtaining a root z between zero and one, we compute 


$$\hat c = \frac{\operatorname{colog} z}{q} = \frac{\log (1/z)}{q}. \tag{2.7}$$


If the equation has no root between zero and one, the fitting of the 
curve cannot be carried out. On the other hand, if two or more such 
roots were to be found, it would not be possible to choose among them. 
However, this case has never been observed in the applications. 

If we put in (2.3) $y_1 = y_2 = \cdots = y_n = K$, it is easy to see that
the determinant vanishes identically, since its first column is then K
times the second. Therefore from (2.6) it follows that


$$J_{p1}(z) + J_{p2}(z) + \cdots + J_{pp}(z) = 0,$$


a relation which was useful for checking the tables for these polynomials.
This identity shows also that if we subtract from the treatment means
$\bar y_1, \bar y_2, \cdots, \bar y_p$ a constant number K, the remainders obtained also
satisfy equation (2.6) and can be used to obtain the correct root. If
$K = \bar y_1$, the first remainder is zero and equation (2.6) is changed to

$$(\bar y_2 - \bar y_1)J_{p2}(z) + \cdots + (\bar y_p - \bar y_1)J_{pp}(z) = 0, \tag{2.8}$$

which is easier to solve.
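After obtaining a root z, from which $\hat c$ is calculated by (2.7), the
values of A and b can be easily obtained from (2.1). For the special
case of p = 4 we have


$$\hat A = \frac{-z(1 + z + z^2)\,\bar y_1 + (1 - z^2 - z^3)\,\bar y_2 + (1 + z - z^3)\,\bar y_3 + (1 + z + z^2)\,\bar y_4}{P_4(z)} \tag{2.9}$$

or

$$\hat A = \bar y_1 + \frac{(\bar y_2 - \bar y_1)(1 - z^2 - z^3) + (\bar y_3 - \bar y_1)(1 + z - z^3) + (\bar y_4 - \bar y_1)(1 + z + z^2)}{P_4(z)}, \tag{2.10}$$

with $P_4(z) = (1 - z)(3 + 4z + 3z^2)$, and

$$\hat b = (1/\hat c)\,\log \frac{1 + z + z^2 + z^3}{4 - (1/\hat A)(\bar y_1 + \bar y_2 + \bar y_3 + \bar y_4)}. \tag{2.11}$$


For the case of p = 5 we obtain


$$\hat A = \frac{-z(1 + z^2)\,\bar y_1 + (1 - z - z^3)\,\bar y_2 + (1 - z^3)\,\bar y_3 + (1 + z^2 - z^3)\,\bar y_4 + (1 + z^2)\,\bar y_5}{P_5(z)} \tag{2.12}$$

or

$$\hat A = \bar y_1 + \frac{(\bar y_2 - \bar y_1)(1 - z - z^3) + (\bar y_3 - \bar y_1)(1 - z^3) + (\bar y_4 - \bar y_1)(1 + z^2 - z^3) + (\bar y_5 - \bar y_1)(1 + z^2)}{P_5(z)}, \tag{2.13}$$

where $P_5(z) = 2(1 - z)(2 + z + 2z^2)$, and

$$\hat b = (1/\hat c)\,\log \frac{1 + z + z^2 + z^3 + z^4}{5 - (1/\hat A)(\bar y_1 + \bar y_2 + \bar y_3 + \bar y_4 + \bar y_5)}. \tag{2.14}$$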

If the usual assumption of homoscedastic normal distribution of 
the y’s is accepted, the method of least squares, which we have used, 
is equivalent to that of maximum likelihood. 

An equation similar to (2.8) can be obtained even with unequally 
spaced levels. 
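
Once a root z is available, the remaining parameters follow from the first two equations of (2.1), which are linear in A and in the product $A\,10^{-cb}$. The sketch below is an illustration rather than the authors' tabular procedure: it assumes equally spaced levels 0, q, 2q, ..., solves the two linear equations directly for any p, and recovers c and b from (2.7). The function name `fit_given_z` and the numerical values in the check are hypothetical.

```python
import math


def fit_given_z(ybar, z, q):
    """Return (A, b, c) for treatment means ybar at levels 0, q, 2q, ... and a given z.

    Uses the first two normal equations of (2.1), written with B = A * 10**(-c*b):
        sum(ybar)       - p*A  + B*S1 = 0
        sum(ybar * z^m) - A*S1 + B*S2 = 0
    where S1 = sum(z^m), S2 = sum(z^(2m)), m = 0, 1, ..., p-1.
    """
    p = len(ybar)
    zm = [z ** m for m in range(p)]
    s1 = sum(zm)
    s2 = sum(v * v for v in zm)
    sy = sum(ybar)
    syz = sum(y * v for y, v in zip(ybar, zm))
    A = (sy * s2 - syz * s1) / (p * s2 - s1 * s1)
    B = (p * A - sy) / s1
    c = math.log10(1.0 / z) / q            # equation (2.7)
    b = math.log10(A / B) / c              # from B = A * 10**(-c*b)
    return A, b, c


# Hypothetical error-free data generated from known parameters (illustration only).
A_true, b_true, c_true, q = 100.0, 20.0, 0.01, 40.0
means = [A_true * (1.0 - 10.0 ** (-c_true * (m * q + b_true))) for m in range(4)]
print(fit_given_z(means, 10.0 ** (-c_true * q), q))
```

With error-free means the function returns the generating parameters, which gives a simple check on hand computations for p = 4 or p = 5.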


3. The Analysis of Variance 


In the analysis of variance the sum of squares and the degrees of 
freedom corresponding to treatment effects should be split into two 
parts, one attributable to regression by Mitscherlich’s law and the 
other to deviations from the regression curve. If we have p levels
and r replications (pr = n), then the treatments sum of squares is

$$r\sum{}'(\bar y_j - \bar y)^2,$$

where $\bar y_j$ $(j = 1, 2, \cdots, p)$ are the treatment means, $\bar y$ is the grand
mean and $\sum'$ denotes summation over j. If $\hat y_j$ is the yield at the
jth level, computed by the regression equation, we have


$$\sum{}'(\bar y_j - \bar y)^2 = \sum{}'(\bar y_j - \hat y_j)^2 + \sum{}'(\hat y_j - \bar y)^2 + 2\sum{}'(\bar y_j - \hat y_j)(\hat y_j - \bar y). \tag{3.1}$$

From equations (2.1) it can be seen that the last term on the right
of (3.1) vanishes, as shown by Pimentel-Gomes [5], who proved also
that this is not true when the method of moments is used to estimate
the parameters. Hence we obtain:

$$r\sum{}'(\bar y_j - \bar y)^2 = r\sum{}'(\bar y_j - \hat y_j)^2 + r\sum{}'(\hat y_j - \bar y)^2. \tag{3.2}$$

It appears natural to assign degrees of freedom p — 3 to the first 
term and 2 to the second, both on the right hand side of the last relation, 
as done by Pimentel-Gomes [5] and Stevens [9]. Then the first term 
on the right of (3.2), divided by p — 3, provides an estimate for the 
variance of the deviations from the regression equation, which should
not be significantly different from the residual variance if the regression 
curve fits the data well. The second term on the right of (3.2), divided 
by 2, gives an estimate, with two degrees of freedom, for the variance 
attributable to effects of regression by Mitscherlich’s equation, and 
should be significantly greater than the residual variance. 

As there are three parameters in the equation to be fitted, at least 
four levels of fertilization (the control included) should be used, in 
order that at least one degree of freedom should be left for testing the 
goodness of fit of the curve. But a large number of levels is generally 
not good, especially if factorial designs are used, for the number of
treatments would increase excessively. It seems that in most cases 
4 or 5 levels are sufficient. However, if previous experiments have 
shown that the law is followed with reasonable accuracy in the case 
under consideration, its application can be carried out with only three 
levels. But the use of three levels without previous testing seems to
be dangerous, as several examples are known where the law failed
to describe the increases in yield produced by the fertilizer.
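
The partition of the treatment sum of squares can be illustrated numerically. The sketch below is an assumption-laden illustration, not taken from the paper: the treatment means, the number of replications and the value of z are hypothetical, and for a given z the curve is fitted as $\hat y = \alpha + \beta z^{m}$ by linear least squares on the means. The cross-product term vanishes because the residuals are orthogonal to both the constant and $z^{m}$, which is the content of the first two equations of (2.1).

```python
def partition_treatment_ss(ybar, r, z):
    """Split r*sum((ybar_j - grand)^2) into deviations and regression parts.

    ybar: treatment means at equally spaced levels m = 0, 1, ..., p-1.
    r:    number of replications.
    z:    value of 10**(-c*q), taken as given here.
    """
    p = len(ybar)
    u = [z ** m for m in range(p)]
    grand = sum(ybar) / p
    ubar = sum(u) / p
    suu = sum((t - ubar) ** 2 for t in u)
    suy = sum((t - ubar) * (y - grand) for t, y in zip(u, ybar))
    beta = suy / suu
    alpha = grand - beta * ubar
    fitted = [alpha + beta * t for t in u]
    ss_treat = r * sum((y - grand) ** 2 for y in ybar)
    ss_dev = r * sum((y - f) ** 2 for y, f in zip(ybar, fitted))
    ss_reg = r * sum((f - grand) ** 2 for f in fitted)
    cross = 2 * r * sum((y - f) * (f - grand) for y, f in zip(ybar, fitted))
    return ss_treat, ss_dev, ss_reg, cross


# Hypothetical means for p = 4 levels with r = 4 replications and z = 0.5.
ss_treat, ss_dev, ss_reg, cross = partition_treatment_ss([20.0, 31.0, 36.5, 39.0], 4, 0.5)
print(ss_treat, ss_dev + ss_reg, cross)
```

Dividing `ss_dev` by p − 3 and `ss_reg` by 2 then gives the two mean squares discussed above, provided z is itself the least squares root.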


4. Stevens' Method


In 1951 W. L. Stevens published an article [9] in which a new method 
of estimation of the parameters in Mitscherlich’s equation is introduced. 
He uses the equation in the form $y = \alpha + \beta\rho^{x}$ and supposes $x = 0, 1, \cdots,
p - 1$. We have, therefore,


$$\alpha = A, \qquad \rho = 10^{-c}, \qquad \beta = -A\,10^{-cb}.$$


The method of least squares leads now to a system of equations 
exactly equivalent to (2.1). Following Fisher’s method, Stevens 
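takes a rough estimate $r'$ for r, substitutes $r' + \delta r$ for r in the normal
equations and discards the terms where $\delta r$ has exponent greater than
one, assuming that $\delta r$ is small. He then computes corrections for the
rough estimates $a'$, $b'$, $r'$ and obtains the efficient estimates


$$\begin{aligned}
a &= F_{aa}\sum y + F_{ab}\sum y\,r'^{x} + F_{ar}\sum x y\,r'^{x-1},\\
b &= F_{ab}\sum y + F_{bb}\sum y\,r'^{x} + F_{br}\sum x y\,r'^{x-1},\\
r &= r' + \delta r,
\end{aligned}$$

where

$$\delta r = \frac{1}{b}\left[F_{ar}\sum y + F_{br}\sum y\,r'^{x} + F_{rr}\sum x y\,r'^{x-1}\right],$$

$F_{aa}$, $F_{ab}$, $F_{ar}$, $F_{bb}$, $F_{br}$ and $F_{rr}$ being the elements of the reciprocal of the
matrix

$$\begin{pmatrix}
n & \sum r'^{x} & \sum x\,r'^{x-1}\\
\sum r'^{x} & \sum r'^{2x} & \sum x\,r'^{2x-1}\\
\sum x\,r'^{x-1} & \sum x\,r'^{2x-1} & \sum x^2 r'^{2x-2}
\end{pmatrix},$$

which were tabulated by Stevens [9].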

However, if the preliminary estimate r’ (obtained, as Stevens sug- 
gests, by graphical interpolation or by other inefficient methods) is not 
good enough, the convergence is, sometimes, rather slow. It is more 
difficult, also, to obtain an upper bound for the error committed by 
stopping the iterative process at a certain stage, and the computations 
are more troublesome. In addition, the tables published include only
values of r from 0.25 to 0.70 for the cases of 5 or 6 levels, and from 0.30
to 0.75 for the case of 7 levels. If in a series of experiments some of the
values of r fall within these limits and some outside of them, difficulties
arise if we want to use Mitscherlich's law in the analysis of all of them.
This difficulty, of course, can be solved by extending
the tables to cover a wider range of values.

Also, it sometimes happens that the preliminary estimate is outside 
the interval covered by the tables, while the true value is within it. 
This was the case, for example, in an experiment carried out by Dr. 
W. L. Nelson, of the Agronomy Department of the North Carolina 
State College, in 1946. The fertilizer was superphosphate, at the
levels of 0, 40, 80, 120 and 160 pounds per acre, and the crop was Irish
potato. The mean yields obtained were as follows, in pounds per
plot of 1/65 of an acre.


Level | 0 | 1 | 2 | 3 | 4
Yield | 229.1 | 231.8 | 254.2 | 250.6 | 249.6




A preliminary estimate could be obtained as follows, as suggested
by Stevens:


$$r' = \frac{249.6 - 231.8}{250.6 - 229.1} = 0.828.$$


Another rough estimate could be obtained by taking


$$a' = \frac{254.2 + 250.6 + 249.6}{3} = 251.5,$$

$$-b' = 251.5 - 229.1 = 22.4,$$

$$r' = \frac{251.5 - 231.8}{22.4} = 0.879.$$


If we try $r' = 0.70$, then we obtain $\delta r = -0.333$ and an improved
estimate is


$$r = 0.70 - 0.333 = 0.367 \approx 0.37.$$


This value is next corrected to $0.667 \approx 0.67$ by the same method.
A third iteration gives $r = 0.455$.

The true value lies between 0.57 and 0.58. We see, therefore, 
that in this case the convergence is rather slow if Stevens’ method is 
used. 

Assuming that the tables are extended, as suggested above, another
difficulty to be found in the applications can be exemplified by the
present case. In fact, there is no certainty, from the theory, that the
iteration process will converge [10, 11]. Examples can be built up
where $r' + \delta r$ will be farther from the true root than $r'$, even if this is
not usually the case. But one should think, maybe, that these "pathological"
cases do not happen in practice. The above data show that
this is not true. In fact, if we take $r' = 0.80$ as a preliminary estimate,
we shall find $\delta r = 1.1$, which is clearly absurd, since the true root is
between 0.57 and 0.58 and r cannot exceed one. It would be possible
to avoid this difficulty by the application of the "damped least squares"
[10], but then the computational work would be heavier.
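
The behaviour of the iteration on Nelson's data can be examined with a short computation. The sketch below is an illustration under stated assumptions, not Stevens' tabular procedure: `stevens_step` solves the linearized normal equations for $(a, b, b\,\delta r)$ directly instead of using the tabulated F values, and `profile_rss` is a swapped-in check that scans r on a grid, fitting $\alpha$ and $\beta$ by linear least squares at each value. All function names are hypothetical.

```python
def solve3(m, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [v] for row, v in zip(m, rhs)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda i: abs(a[i][col]))
        a[col], a[piv] = a[piv], a[col]
        for i in range(col + 1, 3):
            f = a[i][col] / a[col][col]
            for j in range(col, 4):
                a[i][j] -= f * a[col][j]
    x = [0.0, 0.0, 0.0]
    for i in reversed(range(3)):
        s = a[i][3] - sum(a[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = s / a[i][i]
    return x


def stevens_step(y, r0):
    """One linearized correction for y = alpha + beta*rho**x, x = 0..p-1.

    Regressors are 1, r0**x and x*r0**(x-1); the third coefficient is b*dr.
    """
    p = len(y)
    cols = [[1.0] * p,
            [r0 ** x for x in range(p)],
            [x * r0 ** (x - 1) for x in range(p)]]
    m = [[sum(s * t for s, t in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(s * t for s, t in zip(ci, y)) for ci in cols]
    a, b, b_dr = solve3(m, rhs)
    return a, b, b_dr / b


def profile_rss(y, r):
    """Residual sum of squares after fitting alpha and beta for a fixed r."""
    p = len(y)
    u = [r ** x for x in range(p)]
    ubar, ybar = sum(u) / p, sum(y) / p
    suu = sum((t - ubar) ** 2 for t in u)
    suy = sum((t - ubar) * (v - ybar) for t, v in zip(u, y))
    syy = sum((v - ybar) ** 2 for v in y)
    return syy - suy * suy / suu


yields = [229.1, 231.8, 254.2, 250.6, 249.6]      # Nelson's mean yields
_, _, dr = stevens_step(yields, 0.70)
best_r = min((k / 1000 for k in range(1, 1000)), key=lambda r: profile_rss(yields, r))
print(f"delta r from r' = 0.70: {dr:.3f}; grid minimum of RSS at r = {best_r:.3f}")
```

Calling `stevens_step` repeatedly from different starting values shows how erratic the corrections can be for these data, while the grid search locates the least squares value of r without any starting estimate.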


5. The Economic Aspect of Fertilization 


The use of Mitscherlich’s equation, when appropriate, allows 
suitable consideration of the economic factors behind any experiment 
on fertilizers. The most important criticism against its use in this
sense is that it does not take into consideration the interactions between
fertilizers. This criticism can be answered by the following statements:

I) In experiments with fertilizers the interactions are usually rather 
unimportant for a reasonably wide range of the amounts of the nutrients; 

II) If the experiment shows significant interactions, different equa- 
tions should be worked out for each level of the other fertilizers under 
consideration, the general approach to be described below can be used 
for each case, and the best solution selected; 

III) If the experiment does not allow proper consideration of the 
interactions (which is not unusual, partly due to the acceptance of 
statement I) then this method gives the most profitable level of ferti- 
lization under the conditions of the experiment, which is the best that 
can be done under the restrictions imposed by the data. 

Therefore we shall consider that the amounts of just one fertilizer 
are to vary, all others being kept fixed. If t is the price of one unit of
this particular fertilizer, an increment dx of its amount produces an
increase dE in the expenses, where


$$dE = t\,dx.$$


From Mitscherlich’s equation we find that in the first year the 
increase of yield obtained will be 


$$dy_1 = A_1 c\,(1/\log e)\,10^{-c(x+b)}\,dx,$$


where log denotes the common logarithm and e = 2.718 ⋯ is the base
of natural logarithms. Hence the increase in the income in the first
crop will be


$$dI_1 = A_1 w c\,(1/\log e)\,10^{-c(x+b)}\,dx,$$


w being the price of one unit of the crop yield. But the fertilizer has a 
residual effect which will appear in the following years. The increase 
in the yield in the (¢ + 1)th crop, attributable to the last increment 
dx applied in the first year may be taken as 


dy; = A,cH,(1/log dz, 
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where H,(0 < H; < 1) is the fraction of the fertilizer applied in the 
first year still available to the plant. Hence we find 


dI, = A,weH ,(1/log dx. 
It seems reasonable to put 


$$\frac{1}{f}\,dI_1 + \frac{1}{f^2}\,dI_2 + \frac{1}{f^3}\,dI_3 + \cdots \ge dE, \tag{5.1}$$


f > 1 being a safety factor which also takes into account the interest 
paid on capital invested in fertilizers. If x ≤ x* is the solution of 
(5.1), then we say that x* is the most profitable level of fertilization, 
meaning that, under the conditions of the experiment, it is the highest 
amount to be used such that every dollar spent on fertilizers brings 
an increase in the yield large enough to cover safely this expense and 
the interest paid on it and give some profit. 

The solution of (5.1) will usually be rather difficult to obtain, 
unless some simplifying assumptions are made. It seems reasonable, 
for example, to replace every $A_i$ (i = 1, 2, 3, ...) by an average 
value A. Also we may take 


$$H_i = h^i,$$


with 0 < h < 1, even if this seems to be a pessimistic hypothesis, for 
when a fertilizer is applied to the soil the loss in the first and second 
year is large, but afterwards the losses are relatively small. However, 
in this case a pessimistic hypothesis seems to be better than an optimistic 
one, and, in addition, usually new applications of the fertilizer will be 
made in the following years, so that the losses will be high. 
Inequality (5.1) now becomes 


$$A w c\,\frac{1}{f \log e}\,10^{-c(x+b)}\,S(f, h, x) \ge t, \tag{5.2}$$

where 

$$S(f, h, x) = \sum_{i=0}^{\infty} \left(\frac{h}{f}\right)^{i} 10^{\,c x (1 - h^{i})}.$$


It can be shown that a reasonable approximation to the sum of 
this series, which underestimates its value, is 


$$\frac{f}{f - h}\,10^{\,c(1-k)x}.$$

If this approximation is substituted in (5.2), we obtain 

$$A w c \ge 10^{\,c(kx+b)}\,(f - h)\,t\,(\log e), \tag{5.3}$$

hence 

$$x^* = \frac{1}{k}\left[\frac{1}{c}\log\frac{A w c}{(f - h)\,t \log e} - b\right], \tag{5.4}$$

where 

$$k = 1 - h(1 - h)/f.$$


If we take h = 0, that is, if we assume that no residual effect is 
taken into consideration, then we find 


$$x^* = \frac{1}{c}\log\frac{A w c}{f\,t \log e} - b, \tag{5.5}$$


a well known formula, which has been successfully applied in Brazil 
[4, 5, 8]. 

The minimum value of k is easily found to be 1 − 1/(4f). Brazilian 
experts on fertilizers generally agreed that a value f = 1.5 would 
be suitable in most cases. Then the minimum value of k would be 
0.833, which is not very different from one. Hence a simpler and 
conservative formula for x* would be 


$$x^* = \frac{1}{c}\log\frac{A w c}{t \log e} - b, \tag{5.6}$$

where we take k = f − h = 1. 

Formula (5.4) gives the amount of fertilizer $x_1 = x^*$ to be applied 
in the first year, assuming that no nutrient of that kind has been added 
for some years. But in the second year (or second crop) we should 
have an inequality similar to (5.4) with x replaced by x + hx*, so 
that the most profitable level would now be 


$$x_2 = x^* - h x^* = x^*(1 - h).$$


It is easily seen that all the following amounts of fertilizer to be 
applied are also going to be equal to x*(1 − h). 
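
The computation of the most profitable level can be sketched numerically. The sketch below is only illustrative and rests on assumptions: it takes (5.4) in the form x* = (1/k)[(1/c) log(Awc/((f − h) t log e)) − b] with k = 1 − h(1 − h)/f, and the series S(f, h, x) as the sum of (h/f)^i 10^{cx(1 − h^i)}. The function names are hypothetical, and the price ratio used in the demonstration is invented.

```python
import math

LOG_E = math.log10(math.e)  # "log e" of the text, about 0.4343


def k_factor(f, h):
    """k = 1 - h(1 - h)/f; its minimum over h is 1 - 1/(4f)."""
    return 1.0 - h * (1.0 - h) / f


def most_profitable_level(A, c, b, w, t, f=1.5, h=0.0):
    """x* in the assumed form of (5.4); with h = 0 it reduces to (5.5)."""
    k = k_factor(f, h)
    return ((1.0 / c) * math.log10(A * w * c / ((f - h) * t * LOG_E)) - b) / k


def conservative_level(A, c, b, w, t):
    """x* by the simplified formula (5.6), i.e. k = f - h = 1."""
    return (1.0 / c) * math.log10(A * w * c / (t * LOG_E)) - b


def series_sum(f, h, c, x, terms=200):
    """Partial sum of the assumed series S(f, h, x)."""
    return sum((h / f) ** i * 10.0 ** (c * x * (1.0 - h ** i))
               for i in range(terms))


def series_lower_bound(f, h, c, x):
    """Assumed closed-form approximation, which underestimates S(f, h, x)."""
    return f / (f - h) * 10.0 ** (c * (1.0 - k_factor(f, h)) * x)


if __name__ == "__main__":
    # A, c, b and h are the values of the example in section 8;
    # the prices w = t = 1 are hypothetical, chosen only for illustration.
    A, c, b, w, t, f, h = 380.0, 0.006743, 36.15, 1.0, 1.0, 1.5, 0.76
    x_star = most_profitable_level(A, c, b, w, t, f, h)
    print("first year:", x_star)
    print("following years:", x_star * (1.0 - h))
    print("conservative (5.6):", conservative_level(A, c, b, w, t))
```

At the returned x*, inequality (5.3) holds with equality, which gives a convenient check on any implementation of (5.4).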


6. The Estimation of h 


The estimation of h can be carried out when the same experiment 
is repeated in the same plots in two or more successive years (or crops). 
Suppose that in the first year the Mitscherlich equation was 

$$y = A_1\left[1 - 10^{-c(x+b_1)}\right].$$

If no further application of the fertilizer is carried out, we may 
assume that in the following year the appropriate equation will be 

$$y = A_2\left[1 - 10^{-c(hx+b_1)}\right] = A_2\left[1 - 10^{-c'(x+b')}\right].$$

The estimation of c' = ch and b' = b_1/h is carried out, of course, 
in the same way as that of c and b described in section 2 above. Hence 

$$h = \hat{c}'/\hat{c} \quad\text{or}\quad h = \hat{b}_1/\hat{b}'.$$

From the few experiments we have in which the application of this 
method was possible it seems that, if the soil had not received the 
fertilizer for some years before, b_1 and b_2 are essentially estimating 
the same thing. When this is true, a better formula for the estimation 
could be used and would give a value of h between ĉ'/ĉ and b̂_1/b̂'. 
However, more experiments must be carried out and studied before 
we can advance further in this direction. 


If the fertilizer is applied to the plots, in the second year, in the 
same amounts as before, we have 


$$c' = c(1 + h), \qquad b' = b_1/(1 + h),$$

so that 

$$h = \hat{c}'/\hat{c} - 1.$$
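
The two situations above lead to slightly different estimates of h. A minimal sketch follows; the function names are hypothetical, and each function returns both the estimate based on c and the one based on b, the latter being meaningful only when the soil content b is comparable in the two years.

```python
def h_without_reapplication(c_first, c_second, b_first, b_second):
    """No new application in year two: c' = c h and b' = b_1 / h."""
    return c_second / c_first, b_first / b_second


def h_with_reapplication(c_first, c_second, b_first, b_second):
    """Same amounts applied again: c' = c (1 + h) and b' = b_1 / (1 + h)."""
    return c_second / c_first - 1.0, b_first / b_second - 1.0


if __name__ == "__main__":
    # Estimates of the potato example in section 8 (1945, then 1946).
    h_from_c, h_from_b = h_with_reapplication(0.006743, 0.01186, 36.15, 22.72)
    print(h_from_c, h_from_b)
```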


7. The Variance of the Estimates of the Parameters 


Stevens [9] showed that the variances and covariances of the 
estimates a, b, r of the parameters α, β, ρ are given by the formulas 


$$V(a) = F_{aa}s^2, \quad V(b) = F_{bb}s^2, \quad V(r) = (F_{rr}/b^2)s^2,$$
$$\mathrm{Cov}(a, b) = F_{ab}s^2, \quad \mathrm{Cov}(a, r) = (F_{ar}/b)s^2, \quad \mathrm{Cov}(b, r) = (F_{br}/b)s^2,$$


where $s^2$ is, as usual, the estimate of the residual variance, and $F_{aa}$, 
$F_{ab}$, etc. are rational functions in r, the same described in section 4. 
For the important cases of three and four levels these functions were 
not tabulated, but they are rather simple, so that the tabulation is 
not too important. In fact, for the case of three levels they are: 


F,, = (1-1 “(1+ 

Fy, = —(1 — + 3r° + 2’), 
F,,= 
Fy, = (1 — — 4r — 47? + 12r'), 


For the case of four levels we obtain: 
= + 47? + 10r* + 47° + 7°), 

= —Q(r)(1 + + 9r* + 4r° + 37°), 

F,, = — + 2r + + 47° + + +7), 
Fy, = — 4r + 6r? — 12r° + 27r'), 


F,, = — + + + 9r'), 
F,, = Q@r)(1 — + — — — 3r°), 
where 1/Q(r) = 2(1 — r)*(1 + 2r + 4r° + 27° +r‘). 

In all these functions r is equal to the root z in section 2. 

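To avoid confusion between the estimate b of β and the parameter 
b which appears in the original Mitscherlich equation, we shall denote, 
in this section, from now on, the estimate of β by β̂. 

Estimates for the variances of Â, b̂ and ĉ can be obtained also. 
In fact, since Â = a, we have V(Â) = V(a). Also, since $r = 10^{-\hat{c}q}$, we 
obtain 

$$dr = -10^{-\hat{c}q}(1/\log e)\,q\,d\hat{c} = -r(1/\log e)\,q\,d\hat{c},$$

and, as (1/log e) = 2.30, we have 

$$V(\hat{c}) = \frac{V(r)}{(2.3\,rq)^2} = \frac{F_{rr}s^2}{(2.3\,rq\hat\beta)^2}. \tag{7.1}$$

Now 

$$\hat\beta = -a\,10^{-\hat{c}\hat{b}},$$

from which we obtain 

$$\hat{b} = (q/\log r)\log(-\hat\beta/a),$$

hence 

$$d\hat{b} = (q/\log r)\{-(1/a)\,da + (1/\hat\beta)\,d\hat\beta + u\,dr\}.$$

Therefore 

$$V(\hat{b}) = (q/\log r)^2\{(1/a)^2V(a) + (1/\hat\beta)^2V(\hat\beta) + u^2V(r) - (2/a\hat\beta)\,\mathrm{Cov}(a,\hat\beta) - (2u/a)\,\mathrm{Cov}(a,r) + (2u/\hat\beta)\,\mathrm{Cov}(\hat\beta,r)\} \tag{7.2}$$

or 

$$V(\hat{b}) = \frac{q^2s^2}{(\log r)^2}\{F_{aa}/a^2 + F_{bb}/\hat\beta^2 + (u/\hat\beta)^2F_{rr} - (2/a\hat\beta)F_{ab} - (2u/a\hat\beta)F_{ar} + (2u/\hat\beta^2)F_{br}\},$$

where $u = \log(a/{-\hat\beta})/(r\log r)$. 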
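
When Stevens' tables are not at hand, the factors F can be obtained numerically. The sketch below assumes (this is not stated in the text) that the F's are the elements of the inverse of J'J for the model y = a + b r^x at the equally spaced levels x = 0, 1, ..., n − 1, the third column of J being x r^(x−1), so that b is factored out as in the formulas above. All function names are hypothetical. The demonstration uses the 1945 values of section 8 and formula (7.1).

```python
import math

LOG_E = math.log10(math.e)  # 1 / LOG_E is the 2.30 of formula (7.1)


def inverse_3x3(m):
    """Invert a 3 x 3 matrix by cofactors."""
    def cofactor(i, j):
        rows = [p for p in range(3) if p != i]
        cols = [p for p in range(3) if p != j]
        minor = (m[rows[0]][cols[0]] * m[rows[1]][cols[1]]
                 - m[rows[0]][cols[1]] * m[rows[1]][cols[0]])
        return (-1) ** (i + j) * minor
    det = sum(m[0][j] * cofactor(0, j) for j in range(3))
    return [[cofactor(j, i) / det for j in range(3)] for i in range(3)]


def stevens_factors(r, levels):
    """Matrix of F's in the order (a, b, r), under the assumption above."""
    design = [(1.0, r ** x, x * r ** (x - 1) if x else 0.0)
              for x in range(levels)]
    info = [[sum(row[i] * row[j] for row in design) for j in range(3)]
            for i in range(3)]
    return inverse_3x3(info)


def standard_errors(r, levels, beta, q, s2, replicates):
    """Return s(A) and s(c); s2 is the residual mean square per plot."""
    F = stevens_factors(r, levels)
    var_mean = s2 / replicates          # the curve is fitted to means
    s_A = math.sqrt(F[0][0] * var_mean)
    s_r = math.sqrt(F[2][2] * var_mean) / abs(beta)
    s_c = s_r * LOG_E / (r * q)         # formula (7.1)
    return s_A, s_c


if __name__ == "__main__":
    print(stevens_factors(0.5374, 5)[0][0])
    print(standard_errors(0.5374, 5, -307.5, 40.0, 2525.9, 4))
```

Computed exactly, F_aa comes out slightly below the linearly interpolated value quoted in section 8, as one would expect for a function that curves upward in r.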


8. Example of Analysis 


In an experiment on Irish potatoes carried out in the Tidewater 
station by Dr. W. L. Nelson, of the Agronomy Department of the 
North Carolina State College, in 1945, triple superphosphate was 
applied at the levels of 0, 40, 80, 120 and 160 pounds of P₂O₅ per acre. 
Four replications, in randomized blocks, were used. The plot size 
was 1/65 of an acre, and gypsum was used to compensate for differences 
in calcium due to unequal amounts of superphosphate. Uniform 
application of 100 lb. nitrogen and 160 lb. potash was carried out. In 
1946 the same amounts of superphosphate were applied again to the 
same plots, and there was a further addition of 120 lb. nitrogen and 
200 lb. of potash, uniformly. The yields in the two crops are given 
below, in pounds per plot. 


                         P₂O₅ (lb./acre) 
Year             0        40        80       120       160 
1945          199.2     292.3     387.2     491.4     387.2 
              210.3     387.2     418.8     504.5     629.2 
              253.1     396.6     508.3     446.8     523.1 
              268.0     400.2     508.2     523.1     506.5 
Subtotals     930.6    1476.3    1822.5    1965.8    2046.0 
1946          103.0     193.5     215.5     205.5     217.0 
              110.5     188.5     227.5     234.5     243.0 
              103.5     194.0     201.0     206.5     227.5 
              102.0     178.5     203.0     224.0     237.0 
Subtotals     419.0     754.5     847.0     870.5     924.5 


If we assume that equation (2.6) is multiplied by the number of 
replications, which is 4, we obtain for 1945 the equation 


$$930.6\,J_{51}(z) + 1476.3\,J_{52}(z) + 1822.5\,J_{53}(z) + 1965.8\,J_{54}(z) + 2046.0\,J_{55}(z) = 0.$$

The equation corresponding to (2.8), which is easier to solve, is 


$$R(z) = 545.7\,J_{52}(z) + 891.9\,J_{53}(z) + 1035.2\,J_{54}(z) + 1115.4\,J_{55}(z) = 0.$$

The tables give us, for z = 0, 

$$J_{52}(0) = -3, \quad J_{53}(0) = J_{54}(0) = J_{55}(0) = 1.$$


Therefore we obtain: 


$$R(0) = -(545.7)3 + 891.9 + 1035.2 + 1115.4 = 1405.4.$$


For z = 1 we find in the same way: R(1) = −28347.5. Since 
R(0) and R(1) have opposite signs, we know that a root does exist 
between zero and one. Let us try, for example, z = 0.50, and let us 
take the values from the tables with just one decimal place to begin 
with. We find: 


$$R(0.50) = (545.7)(-9.9) + (891.9)(-6.0) + (1035.2)(1.7) + (1115.4)(8.5) = 486.91.$$


Hence the root lies between 0.50 and 1.00. A few more trials show 
that R(0.53) = 105.26, R(0.54) = −37.32, all three decimal places 
in the tables being now used. The usual linear interpolation gives 
then z = 0.5374 as an estimate of the root, with an error surely smaller 
than 0.01, probably much smaller than this value. Now we find by 
(2.7) 


$$\hat{c} = -(1/40)\log 0.5374 = 0.006743,$$


since q = 40 lb. P₂O₅ per acre, in this case. 
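
The last trials and the interpolation can be reproduced as follows. This is only a sketch: the polynomial values are copied from the five-level columns of the tables in section 9 for z = 0.53 and z = 0.54, and the function names are hypothetical.

```python
import math

# J52(z), J53(z), J54(z), J55(z), copied from the tables of section 9.
J5 = {
    0.53: (-10.565, -7.063, 1.475, 9.542),
    0.54: (-10.807, -7.436, 1.380, 9.919),
}
# Subtotals of 1945 minus the subtotal of the first level.
COEFFICIENTS = (545.7, 891.9, 1035.2, 1115.4)


def R(z):
    """R(z) at a tabulated value of z."""
    return sum(k * j for k, j in zip(COEFFICIENTS, J5[z]))


def interpolated_root(z_low, z_high):
    """Linear interpolation between two trial values of opposite sign."""
    r_low, r_high = R(z_low), R(z_high)
    return z_low + (z_high - z_low) * r_low / (r_low - r_high)


def c_from_root(z, q):
    """c = -(1/q) log z, with q the spacing between successive levels."""
    return -math.log10(z) / q


if __name__ == "__main__":
    z = interpolated_root(0.53, 0.54)
    print(R(0.53), R(0.54), z, c_from_root(z, 40.0))
```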
The computation of A and b is now carried out by formulas (2.12) 
or (2.13), and (2.14), and the equation obtained is 


$$y = 539.1\left[1 - 10^{-0.006743(x + 36.15)}\right]$$


or 

$$y = 539.1 - 307.5\,(0.5374)^{x'},$$

where x' = x/40. From either of these equations the expected values 
$\hat{y}_i$ (i = 1, 2, ..., 5) can be calculated. 

Treatments (lb. P₂O₅ per acre)          0        40        80       120       160 

Expected mean yields (lb./plot)      231.56    373.84    450.29    491.38    513.44 

Observed mean yields (lb./plot)      232.65    369.08    455.63    491.45    511.50 
The sum of squares corresponding to deviations from the regression 
curve can now be computed with the aid of the expected and observed 
mean yields, as follows: 


$$4[(231.56 - 232.65)^2 + (373.84 - 369.08)^2 + \cdots + (513.44 - 511.50)^2] = 224.52.$$
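
The expected yields and the sum of squares of deviations can be checked with a few lines. The sketch uses the rounded coefficients printed above, so the last digits may differ slightly from the printed figures; the names are hypothetical.

```python
LEVELS = (0, 40, 80, 120, 160)  # lb. P2O5 per acre
SUBTOTALS_1945 = (930.6, 1476.3, 1822.5, 1965.8, 2046.0)
REPLICATES = 4


def fitted_yield(x, A=539.1, B=307.5, r=0.5374, q=40.0):
    """y = A - B r^(x/q), the second form of the fitted equation."""
    return A - B * r ** (x / q)


def fitted_yield_log_form(x, A=539.1, c=0.006743, b=36.15):
    """y = A [1 - 10^(-c (x + b))], the first form of the fitted equation."""
    return A * (1.0 - 10.0 ** (-c * (x + b)))


observed = [s / REPLICATES for s in SUBTOTALS_1945]
expected = [fitted_yield(x) for x in LEVELS]
# Deviations refer to means of 4 plots, hence the factor REPLICATES.
ss_deviations = REPLICATES * sum((e - o) ** 2
                                 for e, o in zip(expected, observed))

if __name__ == "__main__":
    print([round(e, 2) for e in expected])
    print(round(ss_deviations, 2))
```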


The analysis of variance can now be completed with the results 
shown below. 


Source of Variation                Degrees of      Sum of          Mean 
                                   freedom         squares         square 

Blocks                                 3          25,130.67        8376.9 
Regression by Mitscherlich law         2         208,274.09     104,137.0*** 
Deviations from regression             2             224.52         112.3 
(Treatments)                          (4)       (208,498.61) 
Residual                              12          30,310.73        2525.9 
Total                                 19         263,940.01 


In the table above the three asterisks denote significance at the 
0.1% level of probability. 
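
The sums of squares of this table can be recomputed from the plot yields given earlier. The sketch below treats each row of the 1945 yield table as one block, which is an assumption, although it reproduces the printed block sum of squares; the function name is hypothetical.

```python
YIELDS_1945 = [
    [199.2, 292.3, 387.2, 491.4, 387.2],
    [210.3, 387.2, 418.8, 504.5, 629.2],
    [253.1, 396.6, 508.3, 446.8, 523.1],
    [268.0, 400.2, 508.2, 523.1, 506.5],
]
SS_DEVIATIONS = 224.52  # deviations from regression, computed above


def randomized_blocks_anova(y):
    """Sums of squares for a randomized block layout (rows are blocks)."""
    n_blocks, n_treatments = len(y), len(y[0])
    grand_total = sum(sum(row) for row in y)
    correction = grand_total ** 2 / (n_blocks * n_treatments)
    total = sum(v * v for row in y for v in row) - correction
    blocks = sum(sum(row) ** 2 for row in y) / n_treatments - correction
    treatments = sum(sum(col) ** 2 for col in zip(*y)) / n_blocks - correction
    return {
        "blocks": blocks,
        "treatments": treatments,
        "residual": total - blocks - treatments,
        "total": total,
    }


ss = randomized_blocks_anova(YIELDS_1945)
ss_regression = ss["treatments"] - SS_DEVIATIONS  # 2 degrees of freedom
residual_mean_square = ss["residual"] / 12
f_regression = (ss_regression / 2) / residual_mean_square

if __name__ == "__main__":
    print(ss, ss_regression, residual_mean_square, f_regression)
```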
Now, by linear interpolation in Stevens’ tables, we obtain 


$$F_{aa}(0.5374) = 3.1715,$$


so that 


$$V(\hat{A}) = (3.1715)(2525.9)/4 = 2002.72,$$


hence s(Â) = 44.75. We divide above by 4 because the curve was 
fitted to the means of 4 replicates. 
Similarly we obtain 


$$s(\hat{c}) = 0.00259, \qquad s(\hat{b}) = 30.94.$$
For the 1946 data the same method of fitting led to the equation 


$$y = 227.4\left[1 - 10^{-0.01186(x + 22.72)}\right]$$

or 

$$y = 227.4 - 122.3\,(0.3355)^{x'},$$


where x' = x/40. The new analysis of variance is given below. 


Source of Variation                Degrees of      Sum of          Mean 
                                   freedom         squares         square 

Blocks                                 3             686.54         228.8 
Regression by Mitscherlich law         2          40,573.63      20,286.8*** 
Deviations from regression             2             202.05         101.0 
(Treatments)                          (4)        (40,775.68) 
Residual                              12             952.52         79.38 
Total                                 19          42,414.74 

We have: 


$$\hat{c} = 0.006743, \qquad \hat{b}_1 = 36.15,$$

$$\hat{c}' = 0.01186, \qquad \hat{b}' = 22.72,$$

so that 


$$h = \frac{0.01186}{0.006743} - 1 = 0.76.$$


To compute x* we must first estimate the average maximum yield 
A. The yield in 1946 was much lower than in 1945, due to unfavorable 
meteorological conditions. The mean of the two maximum yields 
possibly is not, therefore, a good estimate of A. However, in the 
absence of better estimates, we may use it. So 


$$A = \tfrac{1}{2}(539.1 + 227.4) = 383.2 \approx 380.$$


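We find now, by (5.4), with f = 1.5, 


$$x^* = \frac{1}{0.88}\left[\frac{1}{0.006743}\log\frac{(380)(0.006743)\,w}{(1.5 - 0.76)\,t\log e} - 36.15\right] \text{ lb. P}_2\text{O}_5\text{/acre},$$


where w is the selling price of one pound of potatoes and t is the cost 
of one pound of P₂O₅ under the form of triple superphosphate. 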


9. Tables of Polynomials 


z | J41(z) J42(z) J43(z) J44(z) | J51(z) J52(z) J53(z) J54(z) J55(z) 
0.00 | 0.000 −2.000 1.000 1.000 | 0.000 −3.000 1.000 1.000 1.000 
0.01 | 0.020 —2.040 0.980 1.040] 0.031 —3.071 0.959 1.040 1.041 
0.02 0.041 —2.081 0.959 1.081 0.062 —3.144 0.916 1.080 1.085 
0.03 | 0.063 —2.123 0.937 1.123 | 0.096 —3.218 0.872 1.120 1.131 
0.04 | 0.085 —2.165 0.915 1.165] 0.130 —3.295 0.825 1.160 1.180 
0.05 | 0.108 —2.208 0.892 1.208] 0.166 —3.374 0.776 1.200 1.232 
0.06 | 0.132 —2.251 0.868 1.251| 0.204 —3.454 0.725 1.239 1.286 
0.07 | 0.156 —2.295 0.844 1.295 | 0.244 —3.537 0.671 1.279 1.343 
0.08 | 0.181 —2.340 0.819 1.340 | 0.285 —3.622 0.615 1.318 1.403 
0.09 | 0.207 —2.386 0.793 1.386 | 0.328 —3.709 0.557 1.357 1.467 
0.10 | 0.234 —2.432 0.766 1.432 | 0.373 —3.798 0.496 1.396 1.533 
0.11 | 0.262 —2.479 0.738 1.479 | 0.421 —3.890 0.432 1.435 1.603 
0.12 | 0.290 −2.526 0.709 1.527 | 0.470 −3.984 0.365 1.473 1.676 
0.13 | 0.320 —2.575 0.680 1.575 | 0.522 —4.081 0.295 1.511 1.753 
0.14| 0.350 —2.624 0.649 1.624 | 0.576 —4.179 0.222 1.548 1.833 
0.15 | 0.382 —2.674 0.618 1.674| 0.633 —4.281 0.146 1.585 1.917 
0.16| 0.414 —2.724 0.585 1.725| 0.692 —4.385 0.067 1.621 2.005 
0.17 | 0.447 —2.776 0.552 1.777| 0.754 —4.492 —0.016 1.657 2.097 
0.18 | 0.482 —2.828 0.517 1.829| 0.819 —4.601 —0.103 1.692 2.193 
0.19 | 0.517 —2.881 0.482 1.882] 0.888 —4.713 —0.193 1.726 2.293 
0.20) 0.554 —2.934 0.445 1.936 | 0.959 —4.829 —0.288 1.760 2.397 
0.21| 0.591 —2.989 0.407 1.991 | 1.034 —4.946 —0.387 1.792 2.506 
0.22| 0.630 —3.044 0.368 2.046] 1.113 —5.067 —0.490 1.824 2.620 
0.23 | 0.670 —3.100 0.327 2.103 | 1.195 —5.191 —0.597 1.854 2.739 
0.24| 0.711 —3.157 0.285 2.160} 1.282 —5.318 —0.709 1.883 2.862 
0.25 | 0.754 —3.215 0.242 2.219] 1.372 —5.448 —0.826 1.911 2.991 
0.26 | 0.798 —3.273 0.198 2.278| 1.467 —5.581 —0.948 1.938 3.125 
0.27| 0.843 —3.333 0.152 2.338 | 1.566 —5.718 —1.075 1.962 3.265 
0.28 | 0.889 —3.393 0.105 2.399 | 1.670 —5.858 —1.208 1.986 3.410 
0.29| 0.937 —3.454 0.056 2.461 | 1.779 —6.001 —1.347 2.007 3.561 
0.30| 0.986 —3.516 0.006 2.524) 1.893 —6.147 —1.491 2.027 3.718 
0.31 | 1.037 —3.579 —0.046 2.588 | 2.013 —6.297 —1.641 2.044 3.881 
0.32 | 1.089 −3.642 −0.099 2.653 | 2.138 −6.451 −1.798 2.059 4.051 
0.33 | 1.142 −3.707 −0.154 2.719 | 2.270 −6.608 −1.961 2.072 4.228 
0.34 | 1.197 −3.772 −0.211 2.785 | 2.407 −6.768 −2.132 2.082 4.411 
0.35 | 1.254 —3.838 —0.269 2.853| 2.551 —6.933 —2.309 2.089 4.602 
0.36 | 1.312 —3.905 —0.329 2.922] 2.701 —7.101 —2.494 2.094 4.800 
0.37 | 1.372 —3.973 —0.391 2.992] 2.859 —7.272 —2.686 2.095 5.003 
0.38 | 1.434 −4.042 −0.454 3.063 | 3.023 −7.448 −2.886 2.093 5.218 
0.39 | 1.497 −4.112 −0.520 3.135 | 3.196 −7.627 −3.095 2.087 5.439 
0.40 | 1.562 −4.182 −0.587 3.208 | 3.376 −7.811 −3.312 2.077 5.669 
0.41 | 1.628 —4.254 —0.656 3.282] 3.565 —7.998 —3.538 2.064 5.907 
0.42| 1.697 —4.326 —0.728 3.357 | 3.762 —8.189 —3.773 2.046 6.154 
0.43 | 1.767 −4.400 −0.801 3.434 | 3.968 −8.385 −4.017 2.023 6.410 
0.44 | 1.839 −4.474 −0.876 3.511 | 4.184 −8.584 −4.271 1.996 6.676 
0.45| 1.913 —4.549 —0.954 3.590] 4.409 —8.788 —4.536 1.963 6.951 
0.46 | 1.989 −4.625 −1.034 3.669 | 4.644 −8.995 −4.811 1.925 7.237 
0.47 | 2.067 —4.702 —1.116 3.750] 4.890 —9.207 —5.097 1.881 7.532 
0.48 | 2.147 —4.779 —1.200 3.832] 5.147 —9.423 —5.394 1.831 7.839 
0.49 | 2.229 —4.858 —1.286 3.916] 5.415 —9.643 —5.703 1.774 8.156 
0.50 | 2.313 −4.938 −1.375 4.000 | 5.695 −9.867 −6.023 1.711 8.484 
0.51 2.399 —5.018 —1.466 4.086 5.988 —10.096 —6.357 1.640 8.825 
0.52 | 2.487 —5.099 —1.560 4.172 6.293 —10.328 —6.703 1.562 9.177 
0.53 | 2.577 −5.182 −1.656 4.260 | 6.611 −10.565 −7.063 1.475 9.542 
0.54 2.670 —5.265 —1.755 4.350 6.943 —10.807 —7.436 1.380 9.919 
0.55 2.765 —5.349 —1.856 4.440 7.289 —11.052 —7.823 1.276 10.310 
0.56 | 2.862 —5.434 —1.960 4.532 7.651 —11.302 —8.226 1.163 10.714 
0.57 | 2.961 —5.520 —2.067 4.625 8.027 —11.556 —8.643 1.040 11.132 
0.58 3.063 —5.606 —2.176 4.719 8.420 —11.814 —9.076 0.906 11.564 
0.59 | 3.167 —5.694 —2.288 4.815 | 8.829 —12.076 —9.525 0.761 12.011 
0.60 | 3.274 —5.782 —2.403 4.912 9.255 —12.342 —9.991 0.605 12.474 
0.61 3.383 —5.872 —2.521 5.010 9.699 —12.613 —10.474 0.436 12.952 
0.62 3.494 —5.962 —2.642 5.110 | 10.161 —12.887 —10.975 0.255 13.446 
0.63 3.608 —6.053 —2.766 5.211 | 10.643 —13.166 —11.494 0.061 13.957 
0.64 | 3.725 −6.145 −2.893 5.313 | 11.144 −13.448 −12.033 −0.148 14.485 
0.65 | 3.845 —6.238 —3.023 5.417 | 11.666 —13.734 —12.590 —0.371 15.030 
0.66 | 3.967 —6.332 —3.156 5.522 | 12.208 —14.024 —13.168 —0.610 15.593 
0.67 | 4.091 −6.427 −3.293 5.628 | 12.773 −14.318 −13.766 −0.865 16.176 
0.68} 4.219 —6.522 —3.433 5.736 | 13.361 —14.616 —14.386 —1.136 16.777 
0.69 | 4.349 —6.619 —3.576 5.845 | 13.972 —14.917 —15.027 —1.425 17.398 
0.70 | 4.482 —6.716 —3.722 5.956 | 14.607 —15.221 —15.691 —1.733 18.039 
0.71 4.618 —6.814 —3.872 6.068 | 15.267 —15.529 —16.379 —2.060 18.700 
0.72 | 4.757 −6.913 −4.026 6.182 | 15.954 −15.840 −17.090 −2.407 19.384 
0.73 | 4.899 —7.013 —4.183 6.297 | 16.667 —16.154 —17.826 —2.775 20.089 
0.74 5.044 —7.113 —4.343 6.413 | 17.408 —16.471 —18.587 —3.166 20.816 
0.75 5.191 —7.215 —4.508 6.531 | 18.177 —16.791 —19.374 —3.579 21.567 
0.76 5.342 —7.317 —4.676 6.651 | 18.977 —17.114 —20.188 —4.016 22.342 
0.77 5.496 —7.420 —4.848 6.772 | 19.806 —17.439 —21.030 —4.478 23.141 
0.78 5.654 —7.524 —5.024 6.894 | 20.668 —17.766 —21.900 —4.967 23.965 
0.79 5.814 —7.629 —5.203 7.018 | 21.562 —18.095 —22.800 —5.482 24.815 
0.80 | 5.978 −7.734 −5.387 7.144 | 22.490 −18.427 −23.729 −6.026 25.692 
0.81 6.145 —7.841 —5.575 7.271 | 23.452 —18.759 —24.689 —6.599 26.595 
0.82 6.315 —7.948 —5.767 7.400 | 24.450 —19.094 —25.680 —7.204 27.527 
0.83 6.488 —8.056 —5.963 7.530 | 25.486 —19.429 —26.704 —7.840 28.487 
0.84 6.665 —8.164 —6.163 7.662 | 26.559 —19.765 —27.762 —8.509 29.477 
0.85 | 6.846 —8.274 —6.368 7.796 | 27.672 —20.102 —28.854 —9.214 30.497 
0.86 7.030 —8.384 —6.577 7.931 | 28.826 —20.440 —29.980 —9.954 31.548 
0.87 7.218 —8.495 —6.790 8.068 | 30.021 —20.777 —31.143 —10.732 32.631 
0.88 7.409 —8.606 —7.008 8.206 | 31.259 —21.114 —32.343 —11.548 33.746 
0.89 7.604 —8.719 —7.231 8.346 | 32.542 —21.451 —33.581 —12.406 34.895 


0.90} 7.802 —8.832 —7.458 8.488 | 33.871 —21.786 —34.858 —13.305 36.078 
0.91 8.004 —8.946 —7.690 8.631 | 35.247 —22.120 —36.175 —14.248 37.297 
0.92 | 8.210 −9.060 −7.927 8.777 | 36.672 −22.453 −37.533 −15.237 38.551 
0.93 8.420 —9.175 —8.168 8.923 | 38.146 —22.783 —38.933 —16.273 39.843 
0.94] 8.634 —9.291 —8.415 9.072 | 39.672 —23.111 —40.376 —17.357 41.172 


0.95 | 8.852 −9.408 −8.666 9.222 | 41.252 −23.436 −41.863 −18.493 42.541 
0.96 | 9.073 —9.525 —8.922 9.374 | 42.886 —23.758 —43.395 —19.681 43.949 
0.97 | 9.299 −9.643 −9.184 9.528 | 44.576 −24.075 −44.974 −20.924 45.398 
0.98 | 9.528 —9.761 —9.451 9.684 | 46.324 —24.389 —46.600 —22.223 46.889 
0.99 | 9.762 —9.880 —9.723 9.841 | 48.131 —24.697 —48.275 —23.581 48.422 
1.00 | 10.000 —10.000 —10.000 10.000 | 50.000 —25.000 —50.000 —25.000 50.000 
For the subsequent crops the most profitable level of application 
of superphosphate would be estimated by 


$$x^*(1 - h) = 0.24\,x^*.$$
THIRD INTERNATIONAL BIOMETRIC CONFERENCE 


PROCEEDINGS 


Hotel Grande Bretagne, Bellagio (Como), Italy. 
Tuesday, September 1, 1953. 


The Conference was convened by the President of the Society, Pro- 
fessor G. Darmois, shortly after 9 a.m. He introduced Dr. C. Barigozzi, 
Vice-President of the Italian Region, and Professor A. Buzzati-Traverso 
of the Organizing Committee, who, in brief addresses, welcomed the 
Conference to Italy and to Bellagio on behalf of the Italian Government 
and of the Organizing Committee. Their welcoming greetings were 
followed by the Presidential Address of Professor G. Darmois, which 
was then summarized in English by Dr. Cavalli-Sforza. 

The Conference continued with a symposium on the first course in 
biometry (for the scientific program see pages 525-526), and at 12:30 p.m. 
with the first general business meeting. After calling the meeting to 
order, President Darmois appointed Drs. Maria-P. Geppert (Chair), 
Leopold Martin and Harold Hotelling as a Resolutions Committee to 
report at the closing business session, and the Conference elected Drs. 
J. W. Hopkins and A. Linder auditors of the accounts of the Organizing 
Committee. The President called upon Secretary Bliss to report on 
developments in the Society in the four years since the last international 
conference. The Secretary commented on the new Directory, then in 
press (see page 535), noted that the proceedings of the International 
Biometric Symposium in Calcutta in December, 1951, would soon be 
distributed to all members, and remarked on the many meetings organ- 
ized by the Regions and the National Secretaries for their members. 
He reviewed the negotiations with the American Statistical Association 
which led to the acquisition of Biometrics as the journal of the Society, 
and its growing importance under the able editorship of Professor 
Gertrude Cox. He closed with a report on the recent meetings in Nice 
of the I.U.B.S. (see page 535). Following announcements by Dr. 
Cavalli-Sforza the meeting adjourned. A scientific session on mathe- 
matical problems in genetics completed the program for the day. 
Methodological problems in biometry, and biometry in immunology 
were the subjects of the scientific programs in the morning and afternoon 
of September 2nd. In the evening five exhibits were on display, and 
there was a meeting of the Council of the Society, attended by the follow- 
ing Council members and officers: Bliss, Cavalli-Sforza, Cochran, Cox, 
Darmois, Fieller, Finney, Fisher, Geppert, Gjeddeback, Hopkins, Irwin, 
Linder, Martin, Mather, Rao, Schwartz, van der Laan, and Yates. 
Among other decisions, it approved a constitutional amendment to 
change the current designation of “Vice-President” to “Regional Presi- 
dent”, and the appointment of National Secretaries for Brazil, Sweden, 
and Switzerland. 

Following the morning session of September 3rd on biometric meth- 
ods in agriculture, motor boats took members of the Conference and 
guests for an excursion on Lake Como, ending at Villa d’Este in Cernob- 
bio for tea and returning about dusk. In the evening the Azienda 
Autonoma di Soggiorno di Bellagio entertained the Conference at a 
reception and dance at the Lido di Bellagio. 

The program on September 4th opened with papers on functional 
relations in experimentation and on biometric problems in genetics. At 
noon a business meeting on Biometrics was chaired by Editor Cox. Its 
purpose was to discuss editorial policy, answer questions concerning 
procedure and management and to obtain suggestions for the improve- 
ment of the journal. The afternoon program of contributed papers was 
followed in the evening by a banquet and dancing at the Grand Hotel 
Villa Serbelloni. 

The morning program on September 5th concerned industrial appli- 
cations of biometry. At 11:30 Vice-President Barigozzi took the chair 
for the closing business session. He called first on Professor W. G. 
Cochran for a report of the Committee on the Teaching of Biometry, 
which follows. 


Report of the Committee on the Teaching of Biometry 


Two years ago, the committee distributed a questionnaire to mem- 
bers, in order to find out to what extent members are interested in 
problems of teaching and to obtain suggestions from members about the 
work of the committee. 

There was a great variety of answers to the question: ‘‘What can this 
committee do that would be useful to you?” The most common request 
was for the publication of the contents of good courses in biometry, as a 
guide to teachers of the subject. Other members would like lists of 
problems which might be used as exercises for the students, and some 
members pointed out the difficulty of finding interesting examples which 
the teacher can use to illustrate the application of the common statistical 
techniques. 
Suggestions were also received for assistance in keeping up with 
developments in the field by the publication of bibliographic information 
or of articles summarizing recent advances; for the simplification of 
techniques so that they would be more easily understood by biologists; 
and for finding out what techniques are actually used by biologists, in 
order to make instruction more realistic. 
The first opportunity for a meeting of the committee occurred at the 
Third International Biometric Conference at Bellagio. At this meeting 
the committee discussed plans for future work and decided to pursue 
three lines of action. 
1. To assemble for distribution accounts by experienced teachers of 
the purpose, content and mode of instruction used in their courses in 
biometry, including mention of teaching devices which they regard as 
highly successful and of difficulties which they encounter. First priority 
will be given to courses in which the students come from a fairly broad 
range of subjects within biology and second priority to courses for medi- 
cal students. The committee is of the opinion that such accounts will 
be of considerable interest to persons engaged, or about to engage, in 
the teaching of biometry. 
2. To begin work on a survey of the present status of the teaching 
of biometry to biologists, in order to find out how many students in 
biology receive instruction in biometry, and to obtain some information 
about the content and level of the instruction. The committee recog- 
nises that careful planning and execution will be necessary to obtain 
sound information of this type, and that financial assistance may be 
required for the task. The committee proposes to attempt the survey 
first in Great Britain and Italy under the direction of members from 
these countries. 
3. The committee realises the need for short and inexpensive intro- 
ductory books on biometry, which will present basic ideas without undue 
elaboration of formulae. The committee will take such steps as it can 
to stimulate the writing of books of this kind and their translation into 
other languages where this would be useful. 
Members of committee: C. I. Bliss, A. Buzzati-Traverso, W. G. 
Cochran (Ch.), G. Darmois, K. Mather. 


Dr. Frank Yates then reported that the Committee on the Standard- 
isation of Symbols, of which he was chairman, believed no recommenda- 
tions should be made at this time in view of projects pending in the I.S.I. 
and in other national and international organisations, and recommended 
that the Committee be discharged. His recommendation was approved 
by the Conference. Dr. Geppert then presented in English the report 
of the Resolutions Committee, after which it was read in French by 
Dr. Martin. After discussion and amendment by the Conference, the 
following resolutions were adopted by acclamation. 

Meeting in the Closing General Assembly, the participants in the 3rd 
International Biometric Conference: 

1. Thank the President of the Council of Ministers of the Italian 
Republic for the goodwill he has shown toward the organization of the 
3rd International Biometric Conference, which has just been held at 
Bellagio. 

2. Thank the Governments, Universities, and Institutes of teaching and 
research which sent representatives to the Congress. 

3. Thank Professor G. di Francesco, Rector of the University of Milan, 
for the moral and material support he gave to the realization of the 
Conference. 

4. Thank the Istituto Sieroterapico Milanese Serafino Belfanti and all 
the other private Institutions and Organizations for the material and 
financial help they granted toward the organization of the Conference. 

5. Thank the Members of the Organizing Committee and the Chairmen of 
the Sections, whose work was very fruitful. 

6. Thank the Members of the Executive Committee, and in particular 
Professors Barigozzi and Buzzati-Traverso, and congratulate them on the 
impeccable organization both of the working conditions and of the 
entertainment of the delegates and their families. They especially 
applaud Dr. and Mrs. Luigi Cavalli-Sforza, whose tireless devotion was 
shown on every occasion. 

7. The General Assembly unanimously adopts the following resolution: 

In view of the ever-growing importance of biometric methods, both in 
the organization of pure and applied biological research and in the 
analysis of the experimental results obtained, and in view of the social 
and economic importance of the results of research in applied biology, 
the General Assembly expresses the wish that governments and 
Universities establish on an organic basis the teaching of Biometry in 
the broad sense, that is, of the statistical and mathematical aspects of 
pure and applied biology. 

The Assembly requests the International Union of Biological Sciences 
to transmit this wish to U.N.E.S.C.O. 

In view of the great scientific and cultural importance of this 
resolution, the Assembly urges U.N.E.S.C.O. to take this wish into 
consideration and to communicate it to the various governments. 

8. The General Assembly recommends active liaison and closer coopera- 
tion between the Biometric Society, including its regional and national 
organizations, and the major international organizations, such as the 
World Health Organization (W.H.O.), the F.A.O., the I.S.O. 
(International Organization for Standardization), and the International 
Statistical Institute (I.S.I.). The Assembly insists on the necessity of 
applying biometric principles to the solution of problems encountered 
on both the national and the international scale. 

9. Finally, recommends that in each country there be better coordination 
of efforts toward spreading the knowledge and application of biometric 
methods in wider circles. 

With a view to giving concrete form to these plans, the General 
Assembly proposes the institution, in the near future, in continental 
Europe of short vacation sessions devoted to reviewing current 
biometric problems. 

The auditors, Drs. Hopkins and Linder, reported that an examination 
of the finances of the Organizing Committee showed all funds to be 
accounted for and in good order, a report which the Conference adopted 
unanimously. The Secretary announced that the Council had given its 
tentative approval to holding the Fourth International Biometric Con- 
ference in Canada in 1958, and an International Symposium in Brazil 
in 1955. There being no new business, the Conference closed with a 
brief address by Dr. Barigozzi. 
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PRESIDENTIAL ADDRESS 
Dignités nouvelles de la Statistique dans la Recherche. 


3 PAR GEORGES DARMOIS 
Université de Paris 


I should like first to thank all the devoted organizers of this
Conference. Everyone knows how many varied labors, and how many
solutions to unforeseen and ever-recurring difficulties, go into a task
of this scale.

It is rather like a wedding, where one must think of everything.

First of all, it is our Secretary General, Dr. Cavalli-Sforza, who has
smilingly borne the burden of this preparation. Then there are all the
chairmen of our various sections, all the scientists who have prepared
communications, and those who have given thought to the discussions and
interventions.

I therefore thank all of you who have been willing to come, some from
quite far away, like our friends from the United States and from India,
to bring us the results of your work and of your experience.

In this admirable setting, so happily furnished by nature and chosen
with such taste for our conference, you are going to show us all the
vitality and all the young and active strength of the Biometric Society.

It is precisely of these young forces that I should like to say a few
words.

In my title I spoke of the new dignities of Statistics in research.
These dignities are of course not new to you, who exercise them, but I
thought it useful to take stock of the situation, which is in any case
evolving rather quickly.

What I meant is that the statistician now comes in much earlier, and
that instead of being associated belatedly with the exploitation of the
results, he has a role which begins before the observations.

Not so long ago he was called in for the obscure cases, where reasoning
did not seem to succeed.
Statistics, Adolphe Thiers once said, is the art of being precise about
things one does not know. Thus the statistician was presented with a
shapeless heap of observations, made with no very precise aim, and was
told: “Bring us statistical conclusions”.

There has been a veritable revolution on this point, and it is only
beginning. The technician who used to be called in at the end now does
“Design of Experiment”. One can certainly attribute to Sir Ronald Fisher
a very important role in the impetus given to this idea, that it is
good, indeed necessary, to organize the observations in advance so as to
draw the best information from them.

In general terms I would say: “We have to seek the best path for
advancing toward knowledge”.

Of course we must always remain modest and sensible. Our services are
called upon, but let us not be too authoritarian and peremptory in the
advice we give after deriving it from theory.

The shortest path over rough ground is not always the best. It may even
be unusable, and too much refinement in the pursuit of perfection must
not keep us from action.

As an important example, one can obviously regard as an extension of the
ideas of experimental planning such successes as sequential methods, in
which the progress toward knowledge is made in a number of steps not
fixed in advance, and stops when the degree of knowledge one had set for
oneself is attained.

And this leads me to bring explicitly into this optimum path a notion
recently introduced and much studied by communications specialists, the
notion of information.

Taken in its general sense, information that improves is a favorable
modification of our knowledge of a question. It is useful information,
when one has just bought a gun, to have the results of a firing trial,
whether the scatter of one or several cartridges or the observation of
10 shots at a target.

A mean, or a “range”, is a piece of information.

The theory of which we are speaking has aimed at defining a quantity
capable of rating, at each instant of a process, the knowledge attained,
or the distance remaining to be covered.

Fisher defined, in the theory of the estimation of a parameter, a
certain quantity which he called information. Here one is concerned with
a probability law of known form depending on an unknown parameter. This
information is in truth a capacity of the law to inform about the
parameter. It gives rise to remarkable theorems.

Under the same name of information, Hartley and then Shannon and Wiener
have associated with the probability law of a random entity a quantity
intended to rate the statistical knowledge of that random entity given
by the law considered. It is in fact an entropy. It had been thought
that these two kinds of information were closely linked. This is not so;
they merely belong to a certain family of quantities having an analogous
formal structure.

This result was established by Schutzenberger, who, by considerations
that we shall not reproduce, defined this family of quantities as the
mean value of a linear operation performed on the logarithm of the
probability.

For Shannon and Wiener it is the logarithm itself, and for Fisher the
linear operation is the second derivative.
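
The family described here can be sketched in symbols. The notation below ($p(x;\theta)$ for the probability law, $\mathcal{L}$ for the linear operation, and the sign conventions) is an assumption made for illustration and is not taken from the address:

```latex
% A family of "informations": the mean value of a linear operation
% \mathcal{L} applied to the logarithm of the probability.
\[
  J_{\mathcal{L}} \;=\; \mathrm{E}\bigl[\,\mathcal{L}\,\log p(X;\theta)\,\bigr]
\]
% Shannon--Wiener: \mathcal{L} is (minus) the identity, giving an entropy.
\[
  H \;=\; -\,\mathrm{E}\bigl[\log p(X)\bigr]
\]
% Fisher: \mathcal{L} is (minus) the second derivative in the parameter.
\[
  I(\theta) \;=\; -\,\mathrm{E}\!\left[\frac{\partial^{2}}{\partial\theta^{2}}
      \log p(X;\theta)\right]
\]
```

Both quantities share the form "expectation of a linear operation on $\log p$", which is the sense in which they belong to one family without being directly related.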

Others may exist as well, and Wald's sequential method for separating
the values of two parameters uses an information of this kind.

Schutzenberger, in a recent thesis, studied these questions and applied
them to the method of group testing (méthode des groupages), which
arises, as Dorfman was the first to point out, in the diagnosis of
conditions of low frequency.
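
As a small illustration of the group-testing idea attributed to Dorfman, the sketch below computes the expected number of tests per individual under the usual two-stage scheme: one test on a pooled sample of `k` individuals, followed by `k` individual tests only if the pool is positive. The function names, the independence assumption and the prevalence value are illustrative choices, not part of the address.

```python
def expected_tests_per_person(p, k):
    """Expected tests per individual for two-stage pooling.

    p: assumed prevalence of the condition (individuals independent).
    k: assumed pool size (k >= 2).
    One pooled test is shared by k people; with probability
    1 - (1 - p)**k the pool is positive and everyone is retested.
    """
    return 1.0 / k + 1.0 - (1.0 - p) ** k


def best_group_size(p, max_k=100):
    """Pool size between 2 and max_k that minimises the expected cost."""
    return min(range(2, max_k + 1),
               key=lambda k: expected_tests_per_person(p, k))


if __name__ == "__main__":
    prevalence = 0.01  # hypothetical low-frequency condition
    k = best_group_size(prevalence)
    cost = expected_tests_per_person(prevalence, k)
    print(f"best pool size: {k}, expected tests per person: {cost:.4f}")
```

For a rare condition the expected cost falls well below one test per person, which is why the method is attractive when the frequency is low.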

The use of information as a measure of the distance covered toward the
knowledge sought permits the use of good, short methods for reaching the
result.

Let us note only that diagnosis uses the Shannon-Wiener information,
that estimation of the frequency uses Fisher's information, and that the
problem of extracting or sorting out individuals of one category uses
Wald's information.

I do not wish to dwell on the details, despite their importance.

I wished only to point out that, alongside the already numerous
applications of experimental planning, new results and new prospects are
opening up through the use of information, hitherto used almost
exclusively in communication problems.

Nothing is more natural, after all, since, as we have said, the aim is
to progress by experiment toward the knowledge of a solution.

The last results I have mentioned were obtained in the study of a
problem of biometry; the method of group testing applies to many other
questions, and no doubt to industrial research, but I am convinced that
biometry offers a particularly vast field for the general problem of the
optimum path.
THIRD INTERNATIONAL BIOMETRIC CONFERENCE


SCIENTIFIC PROGRAM 


(A book of abstracts of the papers presented at the Con- 
ference was distributed to each participant. It is planned 
to reprint this book and send copies to all members of the 
Biometric Society. Each meeting was organized by the
member of the Organizing Committee who served as its 
chairman.) 


September 1st. 9:30 a.m. Presidential address by G. Darmois (see
page 522).

10 a.m. The First Course in Biometry—a Symposium. Chairman:
W. G. Cochran. L. Martin—Enseignement des principes d’expérimentation
et des méthodes statistiques à des biologistes dans deux établissements
belges d’enseignement supérieur. G. Barbensi—L’insegnamento
della biometria. C. I. Bliss—A course in biometry for graduate students
in biology. A. Vessereau—Enseignement des méthodes statistiques
appliquées à la biométrie.

3 p.m. Mathematical Problems in Genetics—I. Chairman: A. 
Buzzati-Traverso. Sir Ronald Fisher—The variability in the length of 
germ plasm still heterogeneous after a given amount of inbreeding. 
K. Mather—The methodology of biometrical genetics. D. Lowry— 
Variance components with reference to genetic population parameters. 
J. L. Lush—Estimating heritabilities. 

September 2nd. 9 a.m. Methodological Problems in Biometry. 
Chairman: Gertrude Cox. J. W. Hopkins—Some needed significance 
tests. F. Anscombe—Fixed-sample-size analysis of sequential observa- 
tions. W. G. Cochran—The combination of estimates from different 
experiments. M. Keuls—Testing differences between means in an 
analysis of variance. M. J. R. Healy—Decision between two alterna- 
tives; how many experiments? C.R. Rao—A general theory of discrimi- 
nation when information about alternative population distributions is 
based on samples. By title L. Martin—Suggestions for longitudinal 
data in gerontology. 

3 p.m. Biometry in Immunology. Chairman: G. Rasch for H. C.
Batson. R. Prigge—Die Anwendung der Mutungsbereiche in der
Immunitätsforschung. J. Ipsen—Factors of dosage and host determining
antibody response to secondary antigen stimulus. L. B. Holt—Quantitative
studies in diphtheria prophylaxis: an attempt to derive a mathematical
characterization of the antigenicity of diphtheria prophylactics.
S. Peto—A dose response equation for the invasion of microorganisms. 

9:30 p.m. Exhibits. E. Morice—A statistical study of the phe- 
nomena of human growth. E. Olbrich—Nomographic tables with whole 
numbers and the estimation of errors in anthropological studies. S. C.
Pearce and G. H. Freeman—Repeated changing of treatments in trials
with long-lived species. J. M. Tanner—Size, shape and regional differ- 
ences in the vertebral columns of inbred strains of rabbits, studied by 
covariance. J. Dufrenoy and F. M. Goyan (presented by D. Schwartz)— 
A graphical calculator for statistical analysis. 

September 3rd. 9 a.m. Biometric Methods in Agriculture. Chair- 
man: P. V. Sukhatme. F. Yates—The place of simple experiments on 
cultivators’ fields in agricultural development. V.G. Panse—Principles 
of the survey method of experimentation. T. N. Hoblyn and S. C. 
Pearce—Some considerations in the design of successive experiments on 
fruit plantations. G. Rasch—On different sources of errors and the 
advantage of their knowledge in planning experiments. 

September 4th. 9 a.m. Functional Relations in Experimentation. 
Chairman: H. Wold. D. J. Finney—Functional relationships in experi- 
mentation. J. Berkson—Minimum chi square and maximum likelihood 
estimates of regression coefficients. 

11 a.m. Mathematical Problems in Genetics—II. Chairman: F. 
Yates. A. R. G. Owen—Experimental designs in genetics. C. A. B. 
Smith—The calculation of correlation between cousins. 

3 p.m. Contributed Papers. Chairman: M. J. R. Healy. A. F.
Parker-Rhodes—Estimating populations of irregularly observable
organisms. D. W. Goodall—Factor analysis in plant sociology.
M. Fréchet—Réhabilitation de la notion statistique de l’homme moyen.
G. Teissier—Sur la détermination de l’axe d’un nuage rectiligne de
points (read by G. Darmois). G. Karreman—The mathematical biology of
threshold and related phenomena in excitation.
M. W. Bentzon—On the statistical evaluation of dose response curves
in case the dose intervals are large.

September 5th. 9 a.m. Industrial Applications of Biometry. Chairman:
A. Linder. E. A. G. Knowles—Applications of experimental designs
in industry. D. R. Read—The design of chemical experiments.
H. C. Hamaker—Experimental designs in industry: a discussion. 

12 noon. Closing remarks by C. Barigozzi, Vice President of The 
Biometric Society for the Italian region. 
THIRD INTERNATIONAL BIOMETRIC CONFERENCE


REGISTRATION 


The 125 participants in the Conference represented 24 different 
countries, with 101 of them members of The Biometric Society when 
the Directory went to press. Twelve were delegates representing 
governments, governmental departments, international organizations, 
educational institutions or academies of science. These individuals are 
starred in the following list of participants, arranged by countries: 
Argentine—J. R. Hérler;* Austria—E. Olbrich; Belgium—R. van den 
Driessche,* A. Lenger, L. Martin,* A. H. L. Rotti; Brazil—A. Grosz- 
mann; Canada—J. W. Hopkins*; Denmark—M. W. Bentzon, N. 
Gjeddebaeck, A. Hald*, G. Rasch; France—J. Arnoux, G. Darmois, 
D. Dugué, J. M. Faverge, R. Feron, M. Frechet, M. Lamotte, M. J. 
Laurent-Duhamel, E. Morice, B. Pélegrin, D. Schwartz, J. Ulmo, 
L. A. Vessereau; Germany—F. Bernstein, F. J. Geks, M. P. Geppert, 
J. Hartung, S. Koller, R. Prigge, H. Prébstel, D. Wichmann; Gold 
Coast—D. W. Goodall; Greece—B. G. Christidis; India—P. C. Maha- 
lanobis, V. G. Panse, C. R. Rao, P. V. Sukhatme; Italy—E. Baldacci, 
G. Barbensi, C. Barigozzi, L. Boretti, A. Buzzati-Traverso, L. L. 
Cavalli-Sforza, R. Ciferri, G. De Angeli, F. Frassetto, T. Gelsomini, 
V. Nozzolini, A. Palazzi, A. Previtera, R. Scossiroli, F. Sella; Malaya— 
D. R. Westgarth; Mexico—A. M. Flores*; Netherlands—D. van 
Dantzig*, E. F. Drion, J. D. Erlee, H. C. Hamaker, G. Hamming, 
J. Hemelrijk, G. van Iterson*, M. Keuls, E. van der Laan, C. A. G. 
Nass, D. J. Stoker; Norway—N. A. Barricelli, Ø. Nissen, P. Ottestad;
Portugal—A. Tovar de Lemos; Spain—S. Rios Garcia*; Sweden—
H. Bergström, N. Blomqvist, H. Wold; Switzerland—C. Blanc*,
A. Kaelin, A. Linder*, E. M. Lourie; Turkey—O. Düzgüneş; United
Kingdom—F. J. Anscombe, M. S. Bartlett, R. E. Blackith, R. O.
Cashen, O. L. Davies, E. C. Fieller, D. J. Finney, Sir Ronald Fisher,
M. J. R. Healy, A. Bradford Hill, S. B. Holt, L. B. Holt, J. O. Irwin,
E. A. G. Knowles, F. B. Leech, K. Mather, J. A. Nelder, A. R. G. Owen,
A. F. Parker-Rhodes, S. C. Pearce, S. Peto, D. R. Read, M. R. Sampford,
C. A. B. Smith, J. M. Tanner, J. Wishart, F. Yates; United States of
America—J. Berkson, C. A. Bicking, C. I. Bliss, W. G. Cochran,
G. M. Cox*, B. B. Day, H. Hotelling, J. Ipsen, Jr., G. Karreman,
H. W. Kloepfer, K. Kopf, I. M. Lerner, D. C. Lowry, E. Lukacs, J. L.
Lush, H. H. Smith; Uruguay—G. J. Fischer. 
QUERIES


GEORGE W. SNEDECOR, Editor


QUERY 104: It is required to find whether or not y depends on
$x_1$, $x_2$, $x_3$, $x_4$ and $x_5$. One expects to find a relationship
for y depending only on those factors to each of which can be
attributed a significant proportion of the variability of the y’s when
those factors are considered together. For example, table (1) shows that
y depends on $x_1$ and $x_2$, and also on $x_5$ when $x_3$ and $x_4$ are
taken into account. If one decides that $x_3$ and $x_4$ have no effect on
y then, from table (2), $x_5$ also does not affect y. Table (3) then shows
that only $x_1$ accounts for a significant proportion of the variability
in the y’s. The final analysis is given in table (4). This assumes that
the regression on $x_2$, $x_3$, $x_4$ and $x_5$ is zero. Omitting the
“non-significant” variables will then lead to a biased estimate of the
regression on $x_1$, and the analysis of variance tables show that the
estimate of error is inflated. With a larger number of observations the
error would not be inflated to the same extent. The initial hypothesis
determines the method of analysis. In the example quoted the procedure
adopted is suspect, but would it be justified if there were considerably
more observations?


(1)
                              d.f.      s.s.        m.s.

Regression on $x_1$             1     128627***

    after $x_1$                 1      28891*

Remainder                       8      35659      4457.4


Total                                 223860


(2)
                              d.f.      s.s.
Regression on $x_1$             1     128627***
    after $x_1$                 1      28891*
Remainder                       8      35659
Total                                 223860

(3)
                              d.f.      s.s.
Regression on $x_1$                   128627***
                                1      28891
Total                                 223860

(4)
                              d.f.      s.s.
ANSWER: Your procedure is incorrect because only one of the five
mean squares each with one degree of freedom in Table I
provides a test of significance of a particular aspect of the
null hypothesis. The model in the present situation is

$$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + e,$$


the e’s having the usual Gaussian properties. Your Table I provides one
exact test of significance, namely of the null hypothesis that $\beta_5$
is zero, for which F with 1 and 8 degrees of freedom is 25784/4457.4. We
conclude then that y depends on $x_5$.


We can derive from your Table I other valid tests of significance.
For instance, we can obtain the following analysis of variance:


Due to                                        d.f.    S. Sqs.    M. Sq.
Regression on $x_1$                             1     128627
Regression on $x_2$, $x_3$, $x_4$, $x_5$
  after taking account of $x_1$                 4      59574     14,893.5
Remainder                                       8      35659      4,457.4
Total                                          13     223860


This gives a combined test of significance of $\beta_2$, $\beta_3$,
$\beta_4$ and $\beta_5$. The F ratio is 3.34, whereas the 5% point of the
F distribution is 3.84. According to this test then we conclude that y
does not depend on $x_2$, $x_3$, $x_4$ and $x_5$. We could also construct
e.g. an analysis for testing $x_4$ and $x_5$ jointly.
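
The two F ratios discussed so far can be checked directly from the sums of squares quoted in the answer. The sketch below is only an arithmetic check; the variable names are illustrative, and the 5% point for 1 and 8 degrees of freedom (5.32) is taken from standard F tables rather than from the text.

```python
# Sums of squares quoted in the answer (degrees of freedom in comments).
SS_X1 = 128627.0         # regression on x1 (1 d.f.)
SS_X2_TO_X5 = 59574.0    # x2, x3, x4, x5 after x1 (4 d.f.)
SS_X5_LAST = 25784.0     # x5 after x1, x2, x3, x4 (1 d.f.)
SS_REMAINDER = 35659.0   # remainder (8 d.f.)
SS_TOTAL = 223860.0      # total (13 d.f.)


def f_ratio(ss_effect, df_effect, ss_error, df_error):
    """Effect mean square divided by error mean square."""
    return (ss_effect / df_effect) / (ss_error / df_error)


f_x5 = f_ratio(SS_X5_LAST, 1, SS_REMAINDER, 8)
f_joint = f_ratio(SS_X2_TO_X5, 4, SS_REMAINDER, 8)

# 3.84 for (4, 8) d.f. is quoted in the text; 5.32 for (1, 8) d.f. is an
# assumption taken from standard tables.
F05_1_8 = 5.32
F05_4_8 = 3.84

print(f"x5 after x1..x4: F = {f_x5:.2f}, significant at 5%: {f_x5 > F05_1_8}")
print(f"x2..x5 jointly:  F = {f_joint:.2f}, significant at 5%: {f_joint > F05_4_8}")
```

The single-variable test and the joint test use the same error mean square but different numerators, which is why they can disagree.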

The analysis of variance given above cannot be used to test the
significance of the dependence of y on $x_1$. To make this test we must
construct the analysis of variance:


Due to                                        d.f.    S. Sqs.        M. Sq.
Regression on $x_2$, $x_3$, $x_4$, $x_5$        4     (not given)
Regression on $x_1$ after taking account of
  $x_2$, $x_3$, $x_4$, $x_5$                    1                    M
Remainder                                       8      35659         4457.4


The test consists of comparing M/4457.4 with the F distribution with 1
and 8 d.f. If the verdict is significant, then pro tempore on the basis of
the test and the previous one, we would state that y depends only on $x_1$.
The possible bias in a predictor based on $x_1$ only can be written down
from the formula given below.

The reader will have already noticed one inconsistency in the
conclusions that are drawn, namely the test described first indicates
that y depends on $x_5$, while the second test indicates that y does not
depend on $x_5$. To elucidate this matter further, it is necessary to
note that the three tests given above are really giving evidence on the
following:


first test: does use of $x_5$ add precision to our prediction of y based
on $x_1$, $x_2$, $x_3$ and $x_4$?

second test: does use of $x_2$, $x_3$, $x_4$ and $x_5$ add precision to
our prediction of y based on $x_1$ alone?

third test: does use of $x_1$ add precision to our prediction of y based
on $x_2$, $x_3$, $x_4$ and $x_5$?
Tests like the first and the third can be made by the analysis of variance
as indicated above, or can be made by t-tests.

The t-test procedure was given by Fisher (J. Roy. Stat. Soc., 85, 611)
and is mentioned in Statistical Methods for Research Workers, Section
29. Separate t-tests (or F tests) on single independent variables are
correlated because the same estimate of error is used in all. The joint
test, exemplified by the test of $x_2$, $x_3$, $x_4$ and $x_5$ above,
removes this correlation effect. This test was devised many years ago and
is the one used for example in the analysis of covariance. A description
can be found in Kempthorne’s Design and Analysis of Experiments,
pp. 44-47, for example.

The inconsistency noted above is the usual one which arises when 
several contrasts are tested jointly. One contrast may be significant 
judged alone, but its magnitude becomes diluted by other contrasts of 
small magnitude in a joint test. 

However, another type of inconsistency is of frequent occurrence,
which may be exemplified as follows. It may happen that in the
observational set up, as specified by the array of values for $x_1$,
$x_2$, $x_3$, $x_4$ and $x_5$, the values of some of the independent
variables, say $x_4$ and $x_5$, are highly correlated. As a result
Fisher’s test will state that $x_5$ is of no value, which really means
that $x_5$ adds no precision to the prediction of y given that the
predictor contains $x_1$, $x_2$, $x_3$ and $x_4$; also the test on $x_4$
will yield non-significance, meaning that $x_4$ adds no precision to the
prediction of y given that the predictor contains $x_1$, $x_2$, $x_3$ and
$x_5$. However, when one makes the joint test of $x_4$ and $x_5$, which
really asks whether $x_4$ and $x_5$ add precision to the prediction of y,
given that the predictor contains $x_1$, $x_2$ and $x_3$, we may find
that $x_4$ and $x_5$ jointly do add precision.
This difficulty can be bypassed in experimental situations by judicious
selection of the values of $x_1$, $x_2$, $x_3$, $x_4$ and $x_5$ for which
y is observed, i.e. by aiming at orthogonality.
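
A small numerical sketch of this second kind of inconsistency follows. It is deliberately simplified: the predictor contains only an intercept and two nearly collinear variables (called `x4` and `x5` here), and the eight observations are constructed, not taken from the query. The helper `rss` is an illustrative least-squares routine written for this sketch.

```python
def rss(columns, y):
    """Residual sum of squares of y regressed on the given columns,
    obtained by solving the normal equations with Gauss-Jordan elimination."""
    k, n = len(columns), len(y)
    a = [[sum(ci[t] * cj[t] for t in range(n)) for cj in columns]
         + [sum(ci[t] * y[t] for t in range(n))] for ci in columns]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(a[r][i]))  # partial pivoting
        a[i], a[p] = a[p], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [vr - f * vi for vr, vi in zip(a[r], a[i])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    fitted = [sum(b * c[t] for b, c in zip(beta, columns)) for t in range(n)]
    return sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))


# Constructed data: u, v, w are mutually orthogonal contrasts.
ones = [1.0] * 8
u = [1, 1, 1, 1, -1, -1, -1, -1]
v = [1, 1, -1, -1, 1, 1, -1, -1]
w = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
x4 = [a + 0.1 * b for a, b in zip(u, v)]   # x4 and x5 are almost identical
x5 = [a - 0.1 * b for a, b in zip(u, v)]
y = [a + c for a, c in zip(u, w)]          # y depends on u, plus "noise" w

rss_full = rss([ones, x4, x5], y)
error_ms = rss_full / (len(y) - 3)         # 5 residual d.f.

f_x5_given_x4 = (rss([ones, x4], y) - rss_full) / error_ms
f_x4_given_x5 = (rss([ones, x5], y) - rss_full) / error_ms
f_joint = ((rss([ones], y) - rss_full) / 2) / error_ms

print(f"x5 given x4: F = {f_x5_given_x4:.3f}")
print(f"x4 given x5: F = {f_x4_given_x5:.3f}")
print(f"x4 and x5 jointly: F = {f_joint:.3f}")
```

Each variable adds almost nothing once the other is in the predictor, yet the pair together accounts for most of the variation in y.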

In the absence of a priori knowledge, it is therefore necessary to make 
a sequence of significance tests and the interpretation to be made of suc- 
cessive members in the sequence is not clear cut. Nor is it at all clear 
how one should overcome the difficulty of almost complete nonortho- 
gonality. Certainly it seems reasonable first to test each regression 
coefficient separately by the Fisher test. Then one might omit the inde- 
pendent variable which is least significant among the non-significant 
variables, and start again as though this omitted variable had not been 
observed. More optimum procedures may well be available and the 
present author would be interested to hear of them. A difficult situation 
is examined from a point of view much like the present one by Fisher 
(Proc. Roy. Soc. B, 126, 25-29). 
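
The tentative procedure described above (test each coefficient given all the others, drop the least significant non-significant variable, start again) can be sketched as follows. This is an illustration under stated assumptions, not the author's algorithm in any exact sense: the helper names, the constructed data and the table of 5% points supplied by the caller are all hypothetical.

```python
def rss(columns, y):
    """Residual sum of squares of y on the columns (normal equations)."""
    k, n = len(columns), len(y)
    a = [[sum(ci[t] * cj[t] for t in range(n)) for cj in columns]
         + [sum(ci[t] * y[t] for t in range(n))] for ci in columns]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[p] = a[p], a[i]
        for r in range(k):
            if r != i:
                f = a[r][i] / a[i][i]
                a[r] = [vr - f * vi for vr, vi in zip(a[r], a[i])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    fitted = [sum(b * c[t] for b, c in zip(beta, columns)) for t in range(n)]
    return sum((yt - ft) ** 2 for yt, ft in zip(y, fitted))


def backward_eliminate(predictors, y, f_crit):
    """predictors: dict name -> column of values.
    f_crit: dict residual d.f. -> 5% point of F(1, d.f.), supplied by the
    caller from tables. Returns the names of the variables retained."""
    n = len(y)
    ones = [1.0] * n
    kept = dict(predictors)
    while kept:
        cols = [ones] + list(kept.values())
        df_error = n - len(cols)
        rss_full = rss(cols, y)
        error_ms = rss_full / df_error
        # F for each variable, given all the other retained variables.
        f_values = {}
        for name in kept:
            others = [ones] + [c for k2, c in kept.items() if k2 != name]
            f_values[name] = (rss(others, y) - rss_full) / error_ms
        weakest = min(f_values, key=f_values.get)
        if f_values[weakest] >= f_crit[df_error]:
            break                      # every remaining variable is significant
        del kept[weakest]              # proceed as if it had not been observed
    return list(kept)


# Constructed example: y depends on u only; v is irrelevant.
u = [1, 1, 1, 1, -1, -1, -1, -1]
v = [1, 1, -1, -1, 1, 1, -1, -1]
w = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
y = [2 * a + c for a, c in zip(u, w)]
retained = backward_eliminate({"u": u, "v": v}, y, f_crit={5: 6.61, 6: 5.99})
print(retained)
```

As the text notes, exact probabilities cannot be attached to the outcome of such a sequence of tests, so the result is best treated as tentative.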
The procedure suggested will lead to a conclusion to which exact
probabilities cannot be attached. However, it is reasonable to take the 
point of view that the conclusion reached by this method, which is 
objective even though its operating characteristics are not known, should 
be regarded as tentative and the validity of the omission of the eliminated 
variables from the predictor should be tested on a new set of data from 
the same source. Perhaps the best that can be done, given a set of data, 
is to divide the data into two sets at random using one set to estimate the 
dependency relation and the other set to test the validity of this discov- 
ered relation. 
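
The random division of the data suggested here might be sketched as below. The function name, the fixed seed and the use of 14 observation indices are assumptions for illustration only.

```python
import random


def split_at_random(rows, seed=0):
    """Shuffle the observations and cut them into two halves:
    the first to estimate the relation, the second to test it."""
    rng = random.Random(seed)       # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical indices of 14 observations.
estimation_set, validation_set = split_at_random(range(14))
print("estimate on:", sorted(estimation_set))
print("validate on:", sorted(validation_set))
```

With as few observations as in the query, each half leaves very few degrees of freedom for error, so the split is more practical for larger data sets.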

It is perhaps worth commenting that the problem can be viewed as one of
the choice of a predictor of y, and then the standard facets of decision
theory (costs, risks, etc.) would have to be incorporated.

Also it is worth adding that some other procedure perhaps based on 
principal components would have to be used if we have a large number 
of independent variables. 

The situation is simpler if we have a priori knowledge. For example,
we may have good reason to believe that y depends on $x_1$, $x_2$ and
$x_3$ and does not depend on $x_4$ and $x_5$. We may then use the general
test procedure described above to test the accuracy of our a priori
knowledge. The analysis for this test in the present situation is as
follows:


Due to                                        d.f.    S. Sqs.            M. Sq.
Regression on $x_1$, $x_2$, $x_3$               3

Regression on $x_4$, $x_5$ after taking
  account of $x_1$, $x_2$, $x_3$                2     (by subtraction)   M

$F = M/4457.4$ with 2 and 8 d.f.


Even if we accept the validity of our prior knowledge it is probably wise
to use as error mean square that based on deviations from the regression
of y on all variables to insure that it is not biased.

The bias in regression coefficients estimated after the elimination of
some variables may be written down easily (see Kempthorne p. 59).
Denote the full model by

$$y = X_1 \gamma + X_2 \delta + e,$$

where $y$, $X_1$, $X_2$, $\gamma$, $\delta$, $e$ are matrices, $y$,
$\gamma$, $\delta$ and $e$ being column matrices, $\gamma$ and $\delta$
being the arrays of regression coefficients. Then the bias in $\gamma$
resulting from assuming $\delta$ to be zero is equal to

$$(X_1' X_1)^{-1} X_1' X_2 \, \delta,$$

where $X_1'$ is the transpose of $X_1$, and $\delta$ is the true value of
$\delta$, which is erroneously taken to be zero. For example if we ought
to be fitting

$$y = \beta_1 x_1 + \beta_2 x_2$$

but in fact fit

$$y = \beta_1 x_1,$$

the bias in $\beta_1$ is

$$\beta_2 \sum x_1 x_2 \Big/ \sum x_1^2 .$$

It is also worth noting that in writing down a model such as

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots$$

one has already possibly introduced bias by omitting variables
necessarily because they were unobserved.
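
The two-variable case of the bias formula can be checked numerically. In the sketch below the data and coefficient values are made up, there is no error term, and the model has no intercept, matching the example in the text; the bias of the slope fitted on $x_1$ alone is then $\beta_2 \sum x_1 x_2 / \sum x_1^2$.

```python
# Made-up values of the two independent variables and true coefficients.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [1.0, 0.0, 2.0, 1.0]
beta1, beta2 = 2.0, 3.0

# Responses generated exactly from the full model y = beta1*x1 + beta2*x2.
y = [beta1 * a + beta2 * b for a, b in zip(x1, x2)]

s11 = sum(a * a for a in x1)               # sum of x1 squared
s12 = sum(a * b for a, b in zip(x1, x2))   # sum of x1 * x2
s1y = sum(a * c for a, c in zip(x1, y))    # sum of x1 * y

slope_x1_only = s1y / s11                  # least squares fit of y on x1 alone
observed_bias = slope_x1_only - beta1
predicted_bias = beta2 * s12 / s11         # bias formula for this case

print(f"fitted slope on x1 alone: {slope_x1_only:.4f}")
print(f"observed bias {observed_bias:.4f}, predicted bias {predicted_bias:.4f}")
```

The bias vanishes only when $\sum x_1 x_2 = 0$, that is, when the omitted variable is orthogonal to the one retained.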


O. KEMPTHORNE 


QUERY 105: I have a split-split plot experiment with 4 replications.
I have been unable to find a formula for the standard error of the
difference between means of treatments at different levels of the
splits. Can you help me?


ANSWER: Assume we have $r$ replications, $\alpha$ whole-plot treatments,
$\beta$ sub-plot treatments and $\gamma$ sub-sub-plot treatments; the
error mean squares are $E_a$ for the whole-plots, $E_b$ for the
sub-plots and $E_c$ for the sub-sub-plots. The estimated standard error of
the difference between the mean yields of treatments $(a_0 b_0 c_0)$ and
$(a_1 b_1 c_1)$ is

$$\sqrt{\frac{2\,[\,E_a + (\beta - 1)E_b + \beta(\gamma - 1)E_c\,]}{r\beta\gamma}}$$

The same result holds for any comparison with different whole-plot
treatments, e.g. $(a_0 b_0 c_0)$ vs. either $(a_1 b_1 c_0)$ or
$(a_1 b_0 c_1)$ or $(a_1 b_0 c_0)$.
However, if you want to compare $(a_0 b_0 c_0)$ with either
$(a_0 b_1 c_1)$ or $(a_0 b_1 c_0)$, the estimated standard error is

$$\sqrt{\frac{2\,[\,E_b + (\gamma - 1)E_c\,]}{r\gamma}}$$

and if you change only the sub-sub-plot treatment, e.g. $(a_0 b_0 c_0)$
vs. $(a_0 b_0 c_1)$, the estimated standard error is

$$\sqrt{2E_c/r}$$
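
The three standard errors of a difference in a split-split plot experiment can be computed with a short function. This is a sketch under the usual variance-component model for such a design; the function and argument names are illustrative, and the numerical values in the example are invented.

```python
from math import sqrt


def se_differences(r, b, c, e_a, e_b, e_c):
    """Estimated standard errors of differences between two treatment means.

    r: replications; b, c: numbers of sub-plot and sub-sub-plot treatments;
    e_a, e_b, e_c: whole-plot, sub-plot and sub-sub-plot error mean squares.
    Returns (different whole-plot treatments,
             same whole-plot but different sub-plot treatments,
             only the sub-sub-plot treatment differs).
    """
    whole = sqrt(2.0 * (e_a + (b - 1) * e_b + b * (c - 1) * e_c) / (r * b * c))
    sub = sqrt(2.0 * (e_b + (c - 1) * e_c) / (r * c))
    subsub = sqrt(2.0 * e_c / r)
    return whole, sub, subsub


# Invented example: 4 replications, 3 sub-plot and 2 sub-sub-plot treatments.
whole, sub, subsub = se_differences(r=4, b=3, c=2, e_a=23.0, e_b=5.0, e_c=1.0)
print(f"different whole-plot treatments: {whole:.3f}")
print(f"same whole plot, different sub-plot: {sub:.3f}")
print(f"only sub-sub-plot treatment differs: {subsub:.3f}")
```

The example mean squares correspond to variance components of 3, 2 and 1 for whole-plots, sub-plots and sub-sub-plots, so the three squared standard errors should equal 2(3+2+1)/4, 2(2+1)/4 and 2(1)/4.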
I would like to emphasize that these are average standard errors and
are not intended to be used to compare means selected on the basis of the
results in the given experiment. For example, you should not use these
standard errors to compare the two most divergent means in this
experiment.

R. L. ANDERSON


CORRECTION FOR QUERY 100, VOL. 9 (1953), PAGE 253.


There was a misunderstanding between Dr. Irwin and me concerning 
the presence or absence of interaction. The method furnished by Dr. 
Irwin gives unbiased estimates, though not the best ones, if interaction 
is absent; it is the appropriate method to use if interaction is assumed 
present as in table 11.24 of my fourth edition. In fact, on the assumption 
of interaction in the population, this method is the one actually applied 
in table 11.25. 

But under the assumption of no interaction in the population, the 
correct method is the one given in table 11.22, with no alteration. This 
means that the method described in Query 100 was inadvertently applied 
to table 11.22. As the printers would say of this table, “Stet.” 

Dr. Irwin and I join in regret that our misunderstanding was not dis- 
covered before the query was printed. 

GEORGE W. SNEDECOR
THE BIOMETRIC SOCIETY


1953 Directory. Each member of the Society should have received 
his free copy of our new 1953 Directory before this number of Biometrics 
reaches him. It lists all members of the Society as of December 31, 1952, 
plus all members enrolled between then and July, 1953, when the Direc- 
tory went to press. Our first Directory in 1949 listed 902 members and 
the new Directory, four years later, 1152 members, representing a net 
increase of 25 per cent even though 40 per cent of those listed in 1949 
no longer belong to the Society. Geographically, our members live in 
fifty different countries, with no country having an absolute majority. 
Growth has been most notable in Belgium, Germany, Japan, and on the 
African continent. In addition to the alphabetical membership list 
and a geographical summary, the Directory includes a list of the general 
and regional officers of the Society since its founding, the Society 
Constitution and the Council By-Laws. Additional copies of the
Directory may be purchased through the Office of the Secretary-Treasurer
in New Haven.


I.U.B.S. The International Union of Biological Sciences held its 
Eleventh General Assembly in Nice, France on August 17-21, with 
President H. Munro Fox, Professor of Zoology in the University of 
London, presiding. The Biometric Society is one of the nine sections of 
the I.U.B.S., the other sections being Botany, Cytology, Embryology, 
Entomology, Genetics, Limnology, Microbiology and Zoology. The 
Section of Biometry was represented by C. I. Bliss, Sir Ronald Fisher, 
M. Lamotte and A. Linder. The opening address by President Fox was 
followed by reports for the period 1950-53 by the Secretary General of 
the Union, Professor P. Vayssiere, by a representative of each section, 
sub-section, commission and joint commission, and by the Treasurer of
the Union, Professor F. Chodat. These reports, including that on Biom- 
etry by the Secretary of the Society, will be published in the Proceedings 
of the Assembly. 

New regulations for the Union were adopted and activities for the 
period 1953-56 reviewed. Among the symposia accepted by the Assembly
was a proposal by the Section of Biometry for a biometric symposium
in Brazil in July, 1955. Another symposium in Brazil in 1955, on ento- 
mology, was also approved. Since the International Statistical Institute 
will hold its 29th session in Brazil in the summer of 1955, the biometric 
program will be coordinated with these allied international meetings to 
their mutual advantage. Plans were proposed for enlarging the activities 
and income of the Union, so that they would be more nearly commen- 
surate with its scope. The following general officers were elected for the 
period 1953-56: President, S. O. Horstadius, Uppsala; Vice-President, 
R. E. Cleland, Washington, D.C.; Secretary General, G. Montalenti, 
Naples; Secretary, R. Ulrich, Paris; Treasurer, A. Linder, Geneva. 
Drs. Montalenti and Linder are both members of The Biometric Society. 


I.S.I. The 28th Session of the International Statistical Institute, 
with which the Society is affiliated, convened in Rome at F.A.O. head- 
quarters on September 6-12, with many members of the Society in 
attendance. The Session will be remembered not only for its scientific 
programs, but also for the magnificence of the entertainment provided 
by the Organizing Committee, and the interest in the Institute shown by
our hosts. During the week participants in the Session were welcomed 
by Prime Minister Pella, addressed in a special audience by Pope Pius 
XII, and received at the Quirinale Palace by President Einaudi, himself 
a member of the Institute for twenty-five years. Following the Session 
the Institute sponsored a Seminar on September 14-17 as part of its 
statistical education program. Many of the lecturers on mathematical 
and industrial statistics and on sample surveys were members of the 
Society. 


French Region. At the meeting of the Region on May 20, 1953, in Paris,
Professor G. Malecot spoke on “Migration et évolution des populations”,
and Mr. M. Lamotte on “Essai d’emploi de quelques méthodes
mathématiques en génétique des populations (fin)”.


Japanese Section. The first meeting was held on August 8, 1953 at 
the National Institute of Agricultural Sciences, Tokyo, the attendance 
being about fifteen. Various business matters were discussed. It was 
decided to hold at least two general meetings a year, and special meetings 
in Tokyo about four times a year or as required. The spring general 
meeting will be held in connection with the general meeting of the 
Japanese Society of Agronomy. The autumn general meeting will be
held in connection with that of the Japanese Society of Mathematical 
Statistics. It was proposed that a summary of the biometric studies 
conducted in Japan during the year should be presented at one of the
general meetings. 


German Section. The German members of the Society held their 
first session at the Paul Ehrlich Institute in Frankfurt am Main on Sep- 
tember 22. Dr. Maria-Pia Geppert, National Secretary for Germany, 
opened the afternoon business meeting with a report on the Bellagio 
Conference. Society activities in Germany were discussed, including 
arrangements for the publication of biometric articles in Germany, a 
listing of members, both present (now numbering twenty-five) and pro- 
spective, and plans for the next meeting of the section in January. In 
the evening members of the Society, meeting jointly with the
Biologischer Verein e.V., were addressed by Professor A. Linder of the
University of Geneva on “Experimental Design” and by Dr. C. I. Bliss of
New Haven, Conn. on “Two Examples of Covariance’’, with more than 
seventy-five in attendance. An active discussion continued until after 
midnight. 


Netherlands Section. Members of The Biometric Society in the Neth- 
erlands met jointly with the Medical-biological Section of the Nether- 
lands Statistical Society and the Biometric Section of the Netherlands 
Society for Agricultural Science at the University of Utrecht on Sep- 
tember 25, with thirty-six attending. In the morning C. I. Bliss de- 
scribed the role of covariance in solving a medical and an agricultural 
problem, and in the afternoon P. L. F. de Jong discussed the prediction 
of the size of the strawberry crop in the Netherlands in 1953 and 1954 
from the results of previous years, weather records and economic factors. 
The meeting closed with reports on the Third International Biometric 
Conference in Bellagio by E. van der Laan and on the 28th session of the 
International Statistical Institute in Rome by E. F. Drion. 


Belgian Region. At the invitation of Professor P. Spehl, Professor 
C. I. Bliss, Secretary of the Biometric Society, gave a well-received 
lecture before the Société Adolphe Quetelet, Association des 
Biométriciens de Belgique et du Congo Belge. The session was held on 
the premises of the Faculty of Medicine of the University of Brussels 
on September 29. Professor Bliss's lecture, which was followed by a 
discussion, dealt with the following subject: “The solution of a 
medical and of an agricultural experiment with covariance”. About 
thirty members of the Société Adolphe Quetelet applauded the speaker. 




NEWS 


Cooperative Graduate Summer Sessions in Statistics 


Beginning in 1954 North Carolina State College, the University of 
Florida, Virginia Polytechnic Institute and the Southern Regional Edu- 
cation Board will jointly sponsor cooperative Graduate Summer Sessions 
in Statistics. 

The first session will be conducted by a distinguished faculty at 
Virginia Polytechnic Institute in the summer of 1954. Additional 
summer sessions are tentatively planned for North Carolina State 
College and the University of Florida in the two following years. Subse- 
quent sessions will be rotated among these or other institutions through- 
out the South. 

The summer sessions are designed to carry out a recommendation of 
the Southern Regional Education Board’s Commission on Statistics, on 
which the three institutions initiating the program are represented. 
They will be of particular interest to (1) research and professional 
workers who want intensive instruction in basic statistical concepts and 
who wish to learn modern statistical methodology; (2) teachers of 
elementary statistical courses who want some formal training in modern 
statistics; (3) prospective candidates for graduate degrees in statistics; 
(4) graduate students in other fields who desire supporting work in 
statistics; and (5) professional statisticians who wish to keep informed 
of advanced specialized theory and methods. 

Each of the summer sessions will last six weeks and each course will 
carry three semester hours of graduate credit, with a maximum of six 
semester hour credits earned in one summer. The courses are arranged 
to enable the person to take consecutive work in successive summers. 
The summer work in statistics may be applied at any one of the cooperat- 
ing institutions in partial fulfillment of the requirements for a Master’s 
degree. The catalog requirements for the degree must be met at the 
degree-granting institutions. Each Doctoral candidate should consult 
with the institution from which he desires to obtain the degree regarding 
the applicability of the summer courses in statistics. 

During the first session Professor Maurice Kendall of the University 
of London will give a course in Multi-variate Analysis, and Dr. Ralph 
Comstock of North Carolina State College will give one in Quantitative 
Genetics. 
The staff of the Virginia Polytechnic Institute’s Department of 
Statistics will offer such courses as Probability and Inference, Analysis 
of Variance, Statistical Methods, Engineering Statistics, Education 
Statistics, Rank Order Statistics and the Theory of Sequential Methods. 

The department includes R. A. Bradley, D. B. Duncan, M. C. K. 
Tweedie, P. M. Somerville and Boyd Harshbarger. In addition, other 
outstanding statistical scholars will direct special afternoon seminars. 
The agricultural, science and engineering divisions of the College will 
make available advanced courses for students who wish to supplement 
their work in statistics. 

The Virginia Polytechnic Institute is located at Blacksburg in the 
scenic Allegheny Mountains. The summer climate is delightful. 

The fee for the Virginia Polytechnic Institute session is $30.00. 
Board, room, post office box and laundry for the entire session may be 
had for $76.40. The session will run from June 9 through July 17, 1954. 

Inquiries should be addressed to Boyd Harshbarger, Head, Depart- 
ment of Statistics, Virginia Polytechnic Institute, Blacksburg, Virginia. 


Special Statistics Session for Research Engineers, Physicists and Chemists 


During the Spring quarter of 1954 (March 24 to June 4) the Institute 
of Statistics of the University of North Carolina will sponsor a special 
program of course work, lectures and seminars on statistics for research 
engineers, physicists and chemists. The primary objective of this 
program is to provide an opportunity for industrial research workers 
to acquire a working knowledge of modern statistical concepts and 
techniques. Emphasis will be on the efficient design of experiments 
and the analysis of data therefrom. Informal seminars on statistical 
problems submitted by the participating students will be held. Guest 
lecturers will include Dr. W. J. Youden and Dr. M. G. Kendall. Regular 
college credit will be granted for course work satisfactorily completed. 
For further information write to Institute of Statistics, North Carolina 
State College, Box 5457, Raleigh, N. C. 


Meeting of The Biometric Society, Gainesville, Florida 


Titles and abstracts for contributed papers for The Biometric 
Society meeting to be held at the University of Florida, March 17, 18, 
19 and 20, 1954, should be sent to Boyd Harshbarger, Department of 
Statistics, Virginia Polytechnic Institute, Blacksburg, Virginia, not 
later than February 20, 1954. 
Sociology, 283 
Spectrography, 135 
Split plots, 31, 157, 298, 429, 533 
Standard deviation, 182, 293, 400 and 
see variance 
Stochastic model, 212, 218 
Sufficient statistic, 463 
Survival curve, 201 
Switchback design, 430, 431 
Tables, 89, 324, 470, 490, 514 
Taste testing, 1, 22, 39, 264 
Taxonomy, 117, 176 
Teaching statistics, 518 
Tests of significance, 11, 45, 84, 337, 
398, 414, 473, 529 and see chi 
square, F test, goodness of fit, 
likelihood ratio, non-parametric, 
sign test, t test 
combination of, 16, 255 
multiple, 262 
Thomas double Poisson distribution, 
192 
Time-response curve, 445 
Time series, 264 
Transformations, 47, 50, 191, 209, 290, 
384, 427, 467 
Tree crops, 429 
Trends, 260, 304, 339, 429 
Triangle test, 24, 43 
t test, 16, 20, 32, 66, 109, 110, 210, 314, 
530 
Twins, 265 
Uniformity trials, 412 
Variance, 79, 94, 176, 181, 191, 202, 
221, 226, 267, 294, 321, 447, 486, 
508 and see standard deviation 
asymptotic, 65, 207, 261, 448, 485 
conditional, 65 
error, 256, 473 
homogeneity of, 11, 61, 210, 468 
matrix, see covariance 
minimum, 92 
of difference, 148, 312, 533 
of mean, 59, 78, 173, 181, 448 
of median, 75 
sampling, 463 
Variance components, 66, 226, 262, 337 
Variance ratio, 12, 147, 473 and see 
F test, incomplete beta function 
Weighted mean, 59 
Weights, 34, 59, 91, 267, 447, 468, 496 
Yates’ correction, 110 